Cloned genes encoding IG-CD4 fusion proteins and the use thereof.

Fusion proteins of immunoglobulins of the IgM, IgG1 or IgG3 class, wherein the variable region of the light or
heavy chain has been replaced with CD4 or fragments thereof capable of binding to gp120, or immunoglobulin-
like molecules comprising such fusion proteins together with an immunoglobulin light or heavy chain, may be
administered to an animal suffering from HIV or SIV infection. They also are useful in assays for HIV or SIV
comprising contacting a sample suspected of containing HIV or SIV gp120 with the immunoglobulin-like
molecule or fusion protein, and detecting whether a complex is formed.
CLONED GENES ENCODING IG-CD4 FUSION PROTEINS AND THE USE THEREOF

CROSS-REFERENCE TO RELATED APPLICATION 

This application is a continuation-in-part of U.S. Application Serial No. 07/147,351, filed January 22,
1988.


FIELD OF THE INVENTION 

The invention is in the field of recombinant genetics. 


BACKGROUND OF THE INVENTION 

The human and simian immunodeficiency viruses HIV and SIV are the causative agents of Acquired
Immune Deficiency Syndrome (AIDS) and Simian Immunodeficiency Syndrome (SIDS), respectively. See
Curran, J. et al., Science 229:1352-1357 (1985); Weiss, R. et al., Nature 324:572-575 (1986). The HIV virus
contains an envelope glycoprotein, gp120, which binds to the CD4 protein present on the surface of helper T
lymphocytes, macrophages and other cells. Dalgleish et al., Nature 312:763 (1984). After the gp120 binds to
CD4, virus entry is facilitated by an envelope-mediated fusion of the viral and target cell membranes.

During the course of infection, the host organism develops antibodies against viral proteins, including
the major envelope glycoproteins gp120 and gp41. Despite this humoral immunity, the disease progresses,
resulting in a lethal immunosuppression characterized by multiple opportunistic infections, parasitemia,
dementia and death. The failure of host anti-viral antibodies to arrest the progression of the disease
represents one of the most vexing and alarming aspects of the infection, and augurs poorly for vaccination
efforts based upon conventional approaches.

Two factors may play a role in the inefficacy of the humoral response to immunodeficiency viruses.
First, like other RNA viruses (and like retroviruses in particular), the immunodeficiency viruses show a high
mutation rate which allows antigenic variation to progress at a high rate in response to host immune
surveillance. Second, the envelope glycoproteins themselves are heavily glycosylated molecules presenting
few epitopes suitable for high affinity antibody binding. The poorly antigenic, "moving" target which the viral
envelope presents allows the host little opportunity for restricting viral infection by specific antibody
production.

Cells infected by the HIV virus express the gp120 glycoprotein on their surface. Gp120 mediates fusion
events among CD4 cells via a reaction similar to that by which the virus enters the uninfected cell, leading
to the formation of short-lived multinucleated giant cells. Syncytium formation is dependent on a direct
interaction of the gp120 envelope glycoprotein with the CD4 protein. Dalgleish et al., supra; Klatzmann, D.
et al., Nature 312:763 (1984); McDougal, J.S. et al., Science 231:382 (1986); Sodroski, J. et al., Nature
322:470 (1986); Lifson, J.D. et al., Nature 323:725 (1986); Sodroski, J. et al., Nature 321:412 (1986).

The CD4 protein consists of a 370 amino acid extracellular region containing four immunoglobulin-like
domains, a membrane spanning domain, and a charged intracellular region of 40 amino acid residues.
Maddon, P. et al., Cell 42:93 (1985); Clark, S. et al., Proc. Natl. Acad. Sci. (USA) 84:1649 (1987).

Evidence that CD4-gp120 binding is responsible for viral infection of cells bearing the CD4 antigen
includes the finding that a specific complex is formed between gp120 and CD4. McDougal et al., supra.
Other workers have shown that cell lines which could not be infected by HIV were converted to infectable cell
lines following transfection and expression of the human CD4 cDNA gene. Maddon et al., Cell 47:333-348
(1986).

In contrast to the majority of antibody-envelope interactions, the receptor-envelope interaction is
characterized by a high affinity (Ka ≈ 10^8 l/mole), immutable association. Moreover, the affinity of the virus for
CD4 is at least 3 orders of magnitude higher than the affinity of CD4 for its putative endogenous ligand, the
MHC class II antigens. Indeed, to date, a specific physical association between monomeric CD4 and class II
antigens has not been demonstrated.
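
The affinity figures quoted above can be restated as a short piece of arithmetic. The following is a minimal sketch, assuming a simple 1:1 binding model in which the dissociation constant is the reciprocal of the association constant; the function and variable names are hypothetical and are used for illustration only.

```python
# Illustrative arithmetic only; the constants restate figures from the text.
KA_GP120_CD4 = 1e8  # association constant for gp120-CD4, in l/mole


def dissociation_constant(ka_l_per_mole: float) -> float:
    """Return Kd in mole/l, the reciprocal of Ka in l/mole (1:1 binding assumed)."""
    if ka_l_per_mole <= 0:
        raise ValueError("Ka must be positive")
    return 1.0 / ka_l_per_mole


def ka_upper_bound(ka_reference: float, orders_of_magnitude: int) -> float:
    """Return the largest Ka that is still the given number of orders lower."""
    return ka_reference / 10 ** orders_of_magnitude


kd_nanomolar = dissociation_constant(KA_GP120_CD4) * 1e9
ka_class_ii_max = ka_upper_bound(KA_GP120_CD4, 3)

print(f"Kd(gp120-CD4) is about {kd_nanomolar:.0f} nM")
print(f"Ka(CD4-class II) is at most about {ka_class_ii_max:.0e} l/mole")
```

Under these assumptions, a Ka of about 10^8 l/mole corresponds to a dissociation constant of about 10 nM, and a difference of at least 3 orders of magnitude places the CD4-class II association constant at or below about 10^5 l/mole.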

In response to bacterial or other particle infection, the host organism usually produces serum antibodies
that bind to specific proteins or carbohydrates on the bacterial or particle surface, coating the bacteria. This
antibody coat on the bacterium or other particle stimulates cytolysis by Fc-receptor-bearing lymphoid cells
by antibody-dependent cellular toxicity (ADCC). Other serum proteins, collectively called complement (C),
bind to antibody-coated targets, and also can coat foreign particles nonspecifically. They cause cell death
by lysis, or stimulate ingestion by binding to specific receptors on the macrophage called complement
receptors. See Darnell, J. et al., in Molecular Cell Biology, Scientific American Books, pp. 641 and 108
(1986).

The most effective complement activating classes of human Ig are IgM and IgG1. The complement
system consists of 14 proteins that, acting in order, cause lysis of cells. Nearly all of the C proteins exist in
normal serum as inactive precursors. When activated, some become highly specific protease enzymes
whose substrate is the next protein in a sequential chain reaction.

The entire C sequence can be triggered by either of two initiation pathways. In one (the classic
pathway) Ab-Ag complexes bind and activate C1, C4 and C2 to form a C3-splitting enzyme. In the second
pathway, polysaccharides commonly on the surface of many bacteria and fungi bind with trace amounts of
a C3 fragment and then with two other proteins (factor B and properdin) to form another C3-splitting
enzyme. Once C3 is split by either pathway, the way is open for the remaining sequence of steps which
lead to cell lysis. See Davis, B.D., et al., in Microbiology, 3rd ed., Harper and Row, Philadelphia, PA, pp.
452-466 (1980).

A number of workers have disclosed methods for preparing hybrid proteins. For example, Murphy,
United States Patent 4,675,382 (1987), discloses the use of recombinant DNA techniques to make hybrid
protein molecules by forming the desired fused gene coding for a hybrid protein of diphtheria toxin and a
polypeptide ligand such as a hormone, followed by expression of the fused gene.

Many workers have prepared monoclonal antibodies (Mabs) by recombinant DNA techniques. Monoclonal
antibodies are highly specific, well-characterized molecules in both primary and tertiary structure.
They have been widely used for in vitro immunochemical characterization and quantitation of antigens.
Genes for heavy and light chains have been introduced into appropriate hosts and expressed, followed by
reaggregation of the individual chains into functional antibody molecules (see, for example, Munro, Nature
312:597 (1984); Morrison, S.L., Science 229:1202 (1985); Oi et al., Biotechniques 4:214 (1986); Wood et al.,
Nature 314:446-449 (1985)). Light- and heavy-chain variable regions have been cloned and expressed in
foreign hosts wherein they maintained their binding ability (Moore et al., European Patent Application
0088994 (published September 21, 1983)).

Chimeric or hybrid antibodies have also been prepared by recombinant DNA techniques. Oi and
Morrison, Biotechniques 4:214 (1986), describe a strategy for producing such chimeric antibodies, which
include a chimeric human IgG anti-leu3 antibody.

Gascoigne, N.R.J., et al., Proc. Natl. Acad. Sci. (USA) 84:2936-2940 (1987), disclose the preparation of a
chimeric gene construct containing a T-cell receptor α-chain variable (V) domain and the constant (C) region
coding sequence of an immunoglobulin γ2a molecule. Cells transfected with the chimeric gene synthesize a
protein product that expresses immunoglobulin and T-cell receptor antigenic determinants as well as protein
A binding sites. This protein associates with a normal λ chain to form an apparently normal tetrameric
(H2L2, where H = heavy and L = light) immunoglobulin molecule that is secreted.

Sharon, J. et al., Nature 309:54 (1984), disclose construction of a chimeric gene encoding the variable
(V) region of a mouse heavy chain specific for the hapten azophenylarsonate and the constant (C) region of
a mouse kappa light chain (VHCκ). This gene was introduced into a mouse myeloma cell line. The chimeric
gene was expressed to give a protein which associated with light chains secreted from the myeloma cell
line to give an antibody molecule specific for azophenylarsonate.

Morrison, Science 229:1202 (1985), discloses that variable light- or variable heavy-chain regions can be
attached to a non-Ig sequence to create fusion proteins. This article states that the potential uses for the
fusion proteins are three: (1) to attach antibody specifically to enzymes for use in assays; (2) to isolate non-Ig
proteins by antigen columns; and (3) to specifically deliver toxic agents.

Recent techniques for the stable introduction of immunoglobulin genes into myeloma cells (Banerji, J.
et al., Cell 33:729-740 (1983); Potter, H., et al., Proc. Natl. Acad. Sci. (USA) 81:7161-7165 (1984)), coupled
with detailed structural information, have permitted the use of in vitro DNA methods, such as mutagenesis,
to generate recombinant antibodies possessing novel properties.

PCT Application WO87/02671 discloses methods for producing genetically engineered antibodies of
desired variable region specificity and constant region properties through gene cloning and expression of
light and heavy chains. The mRNA from cloned hybridoma B cell lines which produce monoclonal
antibodies of desired specificity is isolated for cDNA cloning. The generation of light and heavy chain
coding sequences is accomplished by excising the cloned variable regions and ligating them to light or
heavy chain module vectors. This gives cDNA sequences which code for immunoglobulin chains. The lack
of introns allows these cDNA sequences to be expressed in prokaryotic hosts, such as bacteria, or in lower
eukaryotic hosts, such as yeast.

The generation of chimeric antibodies in which the antigen-binding portion of the immunoglobulin is
fused to other moieties has been demonstrated. Examples of non-immunoglobulin genes fused to antibodies
include Staphylococcus aureus nuclease, the mouse oncogene c-myc, and the Klenow fragment of
E. coli DNA polymerase I (Neuberger, M.S., et al., Nature 312:604-612 (1984); Neuberger, M.S., Trends in
Biochemical Science, 347-349 (1985)). European Patent Application 120,694 discloses the genetic engineering
of the variable and constant regions of an immunoglobulin molecule that is expressed in E. coli host
cells. It is further disclosed that the immunoglobulin molecule may be synthesized by a host cell with
another peptide moiety attached to one of the constant domains. Such peptide moieties are described as
either cytotoxic or enzymatic. The application and the examples describe the use of a lambda-like chain
derived from a monoclonal antibody which binds to 4-hydroxy-3-nitrophenyl (NP) haptens.

European Patent Application 125,023 relates to the use of recombinant DNA techniques to produce
immunoglobulin molecules that are chimeric or otherwise modified. One of the uses described for these
immunoglobulin molecules is for whole-body diagnosis and treatment by injection of the antibodies directed
to specific target tissues. The presence of the disease can be determined by attaching a suitable label to
the antibodies, or the diseased tissue can be attacked by carrying a suitable drug with the antibodies. The
application describes antibodies engineered to aid the specific delivery of an agent as "altered antibodies."

PCT Application WO83/101533 describes chimeric antibodies wherein the variable region of an
immunoglobulin molecule is linked to a portion of a second protein which may comprise the active portion of
an enzyme.

Boulianne et al., Nature 312:643 (1984), constructed an immunoglobulin gene in which the DNA
segments that encode mouse variable regions specific for the hapten trinitrophenol (TNP) are joined to
segments that encode human mu and kappa regions. These chimeric genes were expressed to give
functional TNP-binding chimeric IgM.

Morrison et al., P.N.A.S. (USA) 81:6851 (1984), disclose a chimeric molecule utilizing the heavy-chain
variable region exons of an anti-phosphoryl choline myeloma protein G, which were joined to the exons of
either human kappa light-chain gene. The genes were transfected into mouse myeloma cell lines,
generating transformed cells that produced chimeric mouse-human IgG with antigen-binding function.

Despite the progress that has been achieved on determining the mechanism of HIV infection, a need 
continues to exist for methods of treating HIV viral infections. 


SUMMARY OF THE INVENTION 



The invention relates to a gene comprising a DNA sequence which encodes a fusion protein comprising
1) CD4, or a fragment thereof which binds to HIV gp120, and 2) an immunoglobulin light or heavy chain;
wherein said CD4 or HIV gp120-binding fragment thereof replaces the variable region of the light or heavy
immunoglobulin chain.

The invention also relates to vectors containing the gene of the invention and hosts transformed with the
vectors.

The invention also relates to a method of producing a fusion protein comprising CD4, or fragment
thereof which binds to HIV gp120, and an immunoglobulin light or heavy chain, wherein the variable region
of the immunoglobulin light or heavy chain has been substituted with CD4, or HIV gp120-binding fragment
thereof, which comprises:

cultivating in a nutrient medium under protein producing conditions, a host strain transformed with the
vector containing the gene of the invention, said vector further comprising expression signals which are
recognized by said host strain and direct expression of said fusion protein, and

recovering the fusion protein so produced.

The invention also relates to a fusion protein comprising CD4, or fragment thereof which is capable of
binding to HIV gp120, fused at the C-terminus to a second protein which comprises an immunoglobulin light
or heavy chain, wherein the variable region of said light or heavy chain is substituted with CD4 or a HIV
gp120 binding fragment thereof.

The invention also relates to an immunoglobulin-like molecule comprising the fusion protein of the
invention together with an immunoglobulin light or heavy chain, wherein said immunoglobulin-like molecule
binds HIV gp120.

The IgG1 fusion proteins and immunoglobulin-like molecules may be useful for both complement-
mediated and cell-mediated (ADCC) immunity, while the IgM fusion proteins are useful principally through
complement-mediated immunity.

The invention also relates to a complex between the fusion proteins and immunoglobulin-like molecule
of the invention and HIV gp120.

The invention also relates to a method for treating HIV or SIV infections comprising administering the
fusion protein or immunoglobulin-like molecule of the invention to an animal.

The invention further relates to a method for detecting HIV gp120 in a sample comprising contacting a
sample suspected of containing HIV or gp120 with the fusion protein or immunoglobulin-like molecule of the
invention, and detecting whether a complex has formed.



DESCRIPTION OF THE PREFERRED EMBODIMENTS 



The invention is directed to a protein gene which comprises

1) a DNA sequence which codes for CD4, or fragment thereof which binds to HIV gp120, fused to
2) a DNA sequence which encodes an immunoglobulin heavy chain.

Preferably, the antibody has effector function.

The invention is also directed to a protein gene which comprises

1) a DNA sequence which codes for CD4, or fragment thereof which binds to HIV gp120, fused to
2) a DNA sequence which encodes an immunoglobulin light chain; wherein said sequence which
codes for CD4, or HIV gp120-binding fragment thereof, replaces the variable region of the light
immunoglobulin chain.

The invention is also directed to the expression of these novel fusion proteins in transformed hosts and
the use thereof to treat and diagnose HIV infections. In particular, the invention relates to expressing said
genes in mammalian hosts which express complementary light or heavy chain immunoglobulins to give
immunoglobulin-like molecules which have antibody effector function and also bind to HIV or SIV gp120.

The term "antibody effector function" as used herein denotes the ability to fix complement or to 
activate ADCC. 

The fusion proteins and immunoglobulin-like molecules may be administered to an animal for the
purpose of treating HIV or SIV infections. By the term "HIV infections" is intended the condition of having
AIDS or AIDS related complex (ARC), or the condition where an animal harbors the AIDS virus but does not exhibit the
clinical symptoms of AIDS or ARC. By the term "SIV infections" is intended the condition of being infected
with simian immunodeficiency virus.

By the term "animal" is intended all animals which may derive benefit from the administration of the
fusion proteins and immunoglobulin-like molecules of the invention. Foremost among such animals are
humans; however, the invention is not intended to be so limited.

By the term "fusion protein" is intended a fused protein comprising CD4, or fragment thereof which is
capable of binding to gp120, linked at its C-terminus to an immunoglobulin chain wherein a portion of the
N-terminus of the immunoglobulin is replaced with CD4. In general, that portion of immunoglobulin which is
deleted is the variable region. The fusion proteins of the invention may also comprise immunoglobulins
where more than just the variable region has been deleted and replaced with CD4 or HIV gp120 binding
fragment thereof. For example, the VH and CH1 regions of an immunoglobulin chain may be deleted.
Preferably, any amount of the N-terminus of the immunoglobulin heavy chain can be deleted as long as the
remaining fragment has antibody effector function. The minimum sequence required for binding complement
encompasses domains CH2 and CH3. Joining of Fc portions by the hinge region is advantageous
for increasing the efficiency of complement binding.

The CD4 portion of the fusion protein may comprise the complete CD4 sequence, the 370 amino acid
extracellular region and the membrane spanning domain, or the extracellular region. The fusion protein may
comprise fragments of the extracellular region obtained by cutting the DNA sequence which encodes CD4
at the BspM1 site at position 514 or the PvuII site at position 629 (see Table 1) to give nucleotide
sequences which encode CD4 fragments which retain binding to gp120. In general, any fragment of CD4
may be used as long as it retains binding to gp120.
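
The truncation at a restriction site described above can be illustrated in silico. The following is a minimal sketch, assuming only that PvuII recognizes CAGCTG and cuts between CAG and CTG to leave blunt ends. The sequence shown is a hypothetical placeholder and is not the CD4 sequence of Table 1, and the function names are likewise hypothetical.

```python
PVUII_SITE = "CAGCTG"   # PvuII recognition sequence
PVUII_CUT_OFFSET = 3    # blunt cut after the third base of the site (CAG^CTG)


def find_sites(dna: str, site: str) -> list:
    """Return the 1-based start position of every occurrence of site in dna."""
    dna, site = dna.upper(), site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start + 1)          # convert 0-based index to 1-based
        start = dna.find(site, start + 1)    # allow overlapping matches
    return positions


def five_prime_fragment(dna: str, site_position: int, cut_offset: int) -> str:
    """Return the sequence upstream of the cut made at the given site."""
    end = site_position - 1 + cut_offset
    if not 0 < end <= len(dna):
        raise ValueError("cut falls outside the sequence")
    return dna.upper()[:end]


# Hypothetical placeholder sequence, NOT the CD4 coding sequence.
TOY_SEQUENCE = "ATGAACCGGCAGCTGGGAGTC"

sites = find_sites(TOY_SEQUENCE, PVUII_SITE)
fragment = five_prime_fragment(TOY_SEQUENCE, sites[0], PVUII_CUT_OFFSET)
print(sites, fragment)
```

In practice the positions of the sites would be read from the actual coding sequence, and the reading frame at the junction with the immunoglobulin sequence would need to be checked separately; the sketch does not address either point.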

Where the fusion protein comprises an immunoglobulin light chain, it is necessary that no more of the
Ig chain be deleted than is necessary to form a stable complex with a heavy chain Ig. In particular, the
cysteine residues necessary for disulfide bond formation must be preserved on both the heavy and light
chain moieties.

When expressed in a host, e.g., a mammalian cell, the fusion protein may associate with other light or
heavy Ig chains secreted by the cell to give a functioning immunoglobulin-like molecule which is capable of
binding to gp120. The gp120 may be in solution, expressed on the surface of infected cells, or may be
present on the surface of the HIV virus itself. Alternatively, the fusion protein may be expressed in a
mammalian cell which does not secrete other light or heavy Ig chains. When expressed under these
conditions, the fusion protein may form a homodimer.

Genomic or cDNA sequences may be used in the practice of the invention. Genomic sequences are
expressed efficiently in myeloma cells, since they contain native promoter structures.

The constant regions of the antibody cloned and used in the chimeric immunoglobulin-like molecule
may be derived from any mammalian source. The constant regions may be complement binding or ADCC
active. However, preliminary work (see Examples) indicates that the fusion proteins of the invention may
mediate HIV or SIV infected cell death by an ADCC or complement-independent mechanism. The constant
regions may be derived from any appropriate isotype, including IgG1, IgG3, or IgM.

The joining of various DNA fragments is performed in accordance with conventional techniques,
employing blunt-ended or staggered-ended termini for ligation, restriction enzyme digestion to provide
appropriate termini, filling in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid
undesirable joining, and ligation with appropriate ligases. The genetic construct may optionally encode a
leader sequence to allow efficient expression of the fusion protein. For example, the leader sequence
utilized by Maddon et al., Cell 42:93-104 (1985) for the expression of CD4 may be used.

For cDNA, the cDNA may be cloned and the resulting clone screened, for example, by use of a
complementary probe or by assay for expressed CD4 using an antibody as disclosed by Dalgleish et al.,
Nature 312:763-766 (1984); Klatzmann et al., Immunol. Today 7:291-297 (1986); McDougal et al., J.
Immunol. 135:3151-3162 (1985); and McDougal, J. et al., J. Immunol. 137:2937-2944 (1986).

To express the fusion hybrid protein, transcriptional and translational signals recognized by an
appropriate host element are necessary. Eukaryotic hosts which may be used include mammalian cells
capable of culture in vitro, particularly leukocytes, more particularly myeloma cells or other transformed or
oncogenic lymphocytes, e.g., EBV-transformed cells. Alternatively, non-mammalian cells may be employed,
such as bacteria, fungi, e.g., yeast, filamentous fungi, or the like.

Preferred hosts for fusion protein production are mammalian cells, grown in vitro in tissue culture or in
vivo in animals. Mammalian cells provide post-translational modification to immunoglobulin protein molecules
which provide for correct folding and glycosylation of appropriate sites. Mammalian cells which may
be useful as hosts include cells of fibroblast origin such as VERO or CHO-K1, or cells of lymphoid origin,
such as the hybridoma SP2/0-AG14 or the myeloma P3x63Sg8, and their derivatives. For the purpose of
preparing an immunoglobulin-like molecule, a plasmid containing a gene which encodes a heavy chain
immunoglobulin, wherein the variable region has been replaced with CD4 or fragment thereof which binds to
gp120, may be introduced, for example, into J558L myeloma cells, a mouse plasmacytoma expressing the
lambda-1 light chain but which does not express a heavy chain (see Oi et al., P.N.A.S. (USA) 80:825-829
(1983)). Other preferred hosts include COS cells, BHK cells and hepatoma cells.

The constructs may be joined together to form a single DNA segment or may be maintained as
separate segments, by themselves or in conjunction with vectors.

Where the fusion protein is not glycosylated, any host may be used to express the protein which is
compatible with replicon and control sequences in the expression plasmid. In general, vectors containing
replicon and control sequences which are derived from species compatible with a host cell are used in connection
with the host. The vector ordinarily carries a replicon site, as well as specific genes which are capable of
providing phenotypic selection in transformed cells. The expression of the fusion protein can also be placed
under control with other regulatory sequences which may be homologous to the organism in its untransformed
state. For example, lactose-dependent E. coli chromosomal DNA comprises a lactose or lac operon
which mediates lactose utilization by elaborating the enzyme beta-galactosidase. The lac control elements
may be obtained from bacteriophage lambda plac5, which is infective for E. coli. The lac promoter-
operator system can be induced by IPTG.

Other promoter/operator systems or portions thereof can be employed as well. For example, colicin
E1, galactose, alkaline phosphatase, tryptophan, xylose, tac, and the like can be used.

For mammalian hosts, several possible vector systems are available for expression. One class of
vectors utilizes DNA elements which are derived from animal viruses such as bovine papilloma virus,
polyoma virus, adenovirus, vaccinia virus, baculovirus, retroviruses (RSV, MMTV or MOMLV), or SV40 virus.
Cells which have stably integrated the DNA into their chromosomes may be selected by introducing one or
more markers which allow selection of transfected host cells. The marker may provide for prototrophy to an
auxotrophic host, biocide resistance, e.g., antibiotics, or heavy metals such as copper or the like. The
selectable marker gene can be either directly linked to the DNA sequences to be expressed, or introduced
into the same cell by cotransformation. Additional elements may also be needed for optimal synthesis of
mRNA. These elements may include splice signals, as well as transcriptional promoters, enhancers, and
termination signals. The cDNA expression vectors incorporating such elements include those described by
Okayama, H., Mol. Cel. Biol. 3:280 (1983) and others.

Once the vector or DNA sequence containing the constructs has been prepared for expression, the
DNA constructs may be introduced to an appropriate host. Various techniques may be employed, such as
protoplast fusion, calcium phosphate precipitation, electroporation or other conventional techniques. After
the fusion, the cells are grown in media and screened for the appropriate activity. Expression of the gene(s)
results in production of the fusion protein. This expressed fusion protein may then be subject to further
assembly to form the immunoglobulin-like molecule.

The host cells for immunoglobulin production may be immortalized cells, primarily myeloma or
lymphoma cells. These cells may be grown in appropriate nutrient medium in culture flasks or injected into
a syngeneic host, e.g., mouse or rat, or immunodeficient host or host site, e.g., nude mouse or hamster
pouch. In particular, the cells may be introduced into the abdominal cavity of an animal to allow production
of ascites fluid which contains the immunoglobulin-like molecule. Alternatively, the cells may be injected
subcutaneously and the chimeric antibody is harvested from the blood of the host. The cells may be used
in the same manner as hybridoma cells. See Diamond et al., N. Eng. J. Med. 304:1344 (1981), and Kennett,
McKearn and Bechtol (Eds.), Monoclonal Antibodies: Hybridomas: A New Dimension in Biological Analyses,
Plenum (1980).

The fusion proteins and immunoglobulin-like molecules of the invention may be isolated and purified in
accordance with conventional conditions, such as extraction, precipitation, chromatography, affinity
chromatography, electrophoresis or the like. For example, the IgG1 fusion proteins may be purified by
passing a solution through a column which contains immobilized protein A or protein G which selectively
binds the Fc portion of the fusion protein. See, for example, Reis, K.J., et al., J. Immunol. 132:3098-3102
(1984); PCT Application, Publication No. WO87/00329. The chimeric antibody may then be eluted by
treatment with a chaotropic salt or by elution with aqueous acetic acid (1 M).

Alternatively, the fusion proteins may be purified on anti-CD4 antibody columns, or on anti-
immunoglobulin antibody columns.

cDNA sequences which encode CD4, or a fragment thereof which
binds gp120, may be ligated into an expression plasmid which codes for an antibody wherein the variable
region of the gene has been deleted. Methods for the preparation of genes which encode the heavy or light
chain constant regions of immunoglobulins are taught, for example, by Robinson, R. et al., PCT Application,
Publication No. WO87/02671.

Preferred immunoglobulin-like molecules which contain CD4, or fragments thereof, contain the constant
region of an IgM, IgG1 or IgG3 antibody which binds complement at the Fc region.

The fusion protein and immunoglobulin-like molecules of the invention may be used for the treatment of
HIV viral infections. The fusion protein complexes to gp120 which is expressed on infected cells. Although
the inventor is not bound by a particular theory, it appears that the Fc portion of the fusion protein
may bind with complement, which mediates destruction of the cell. In this manner, infected cells are
destroyed so that additional viral particle production is stopped.

For the purpose of treating HIV infections, the fusion protein or immunoglobulin-like molecule of the
invention may additionally contain a radiolabel or therapeutic agent which enhances destruction of the HIV
particle or HIV-infected cell.

Examples of radioisotopes which can be bound to the fusion protein or immunoglobulin-like molecule of
the invention for use in HIV therapy are 131I, 90Y, 67Cu, Bi, At, Pb, Sc, and Pd. Optionally,
a label such as boron can be used which emits α and β particles upon bombardment with neutron radiation.

For in vivo diagnosis, radionuclides may be bound to the fusion protein or immunoglobulin-like
molecule of the invention either directly or by using an intermediary functional group. An intermediary group
which is often used to bind radioisotopes, which exist as metallic cations, to antibodies is
diethylenetriaminepentaacetic acid (DTPA). Typical examples of metallic cations which are bound in this
manner are 99mTc, 123I, 111In, 131I, 97Ru, 67Cu, 67Ga, and 68Ga.

Moreover, the fusion protein and immunoglobulin-like molecule of the invention may be tagged with an
NMR imaging agent which includes paramagnetic atoms. The use of an NMR imaging agent allows the in
vivo diagnosis of the presence of and the extent of HIV infection within a patient using NMR techniques.
Elements which are particularly useful in this manner are 157Gd, 55Mn, Dy, Cr, and Fe.

Therapeutic agents may include, for example, bacterial toxins such as diphtheria toxin, or ricin. Methods
for producing fusion proteins comprising fragment A of diphtheria toxin are taught in U.S. Patent 4,675,382
(1987). Diphtheria toxin contains two polypeptide chains. The B chain binds the toxin to a receptor on a cell
surface. The A chain actually enters the cytoplasm and inhibits protein synthesis by inactivating elongation
factor 2, the factor that translocates ribosomes along mRNA concomitant with hydrolysis of GTP. See
Darnell, J., et al., in Molecular Cell Biology, Scientific American Books, Inc., page 662 (1986). Alternatively,
a fusion protein comprising ricin, a toxic lectin, may be prepared.

Introduction of the chimeric molecules by gene therapy may also be contemplated, for example, using
retroviruses or other means to introduce the genetic material encoding the fusion proteins into suitable
target tissues. In this embodiment, the target tissues having the cloned genes of the invention may then
produce the fusion protein in vivo.

The dose ranges for the administration of the fusion protein or immunoglobulin-like molecule of the
invention are those which are large enough to produce the desired effect whereby the symptoms of HIV or
SIV infection are ameliorated. The dosage should not be so large as to cause adverse side effects, such as
unwanted cross-reactions, anaphylactic reactions, and the like. Generally, the dosage will vary with the age,
condition, sex and extent of disease in the patient, contraindications, if any, immune tolerance and other
such variables, to be adjusted by the individual physician. Dosage can vary from 0.01 mg/kg to 50 mg/kg,
preferably 0.1 mg/kg to 1.0 mg/kg, of the immunoglobulin-like molecule in one or more administrations
daily, for one or several days. The immunoglobulin-like molecule can be administered parenterally by
injection or by gradual perfusion over time. It can be administered intravenously, intraperitoneally,
intramuscularly, or subcutaneously.
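
The weight-based ranges stated above reduce to simple arithmetic. The following is a minimal sketch of that arithmetic only; the function name `dose_range_mg` and the 70 kg body weight are hypothetical values chosen for illustration and do not appear in the specification.

```python
# Illustrative arithmetic only: converts the mg/kg ranges quoted in the text
# into total milligrams per administration for an assumed body weight.

def dose_range_mg(weight_kg, low_mg_per_kg=0.1, high_mg_per_kg=1.0):
    """Return (low, high) total dose in mg for a single administration.

    The defaults are the preferred range from the text (0.1 to 1.0 mg/kg).
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return (weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg)


# Hypothetical 70 kg patient.
preferred = dose_range_mg(70.0)             # preferred range, 0.1 to 1.0 mg/kg
broad = dose_range_mg(70.0, 0.01, 50.0)     # full stated range, 0.01 to 50 mg/kg

print("preferred range (mg):", preferred)
print("broad range (mg):", broad)
```

For the assumed 70 kg weight, the preferred range corresponds to roughly 7 mg to 70 mg per administration, and the broad range to roughly 0.7 mg to 3500 mg.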

Preparations for parenteral administration include sterile aqueous or non-aqueous solutions,
suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol,
vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include
water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media.
Parenteral vehicles include sodium chloride solution, Ringer's dextrose, dextrose and sodium chloride, lactated
Ringer's, or fixed oils. Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers,
such as those based on Ringer's dextrose, and the like. Preservatives and other additives may also be
present, such as, for example, antimicrobials, antioxidants, chelating agents, inert gases and the like. See,
generally, Remington's Pharmaceutical Sciences, 16th Ed., Mack Eds., 1980.

The invention also relates to a method for preparing a medicament or pharmaceutical composition
comprising the components of the invention, the medicament being used for therapy of HIV or SIV infection
in animals.

The detection and quantitation of antigenic substances in biological samples frequently utilize
immunoassay techniques. These techniques are based upon the formation of a complex between the
antigenic substance, e.g., gp120, being assayed and an antibody or antibodies in which one or the other
member of the complex may be detectably labeled. In the present invention, the immunoglobulin-like
molecule or fusion protein may be labeled with any conventional label.

Thus, the hybrid fusion protein or immunoglobulin-like molecule of the invention can also be used in an
assay for HIV or SIV viral infection in a biological sample by contacting a sample, derived from an animal
suspected of having an HIV or SIV infection, with the fusion protein or immunoglobulin-like molecule of the
invention, and detecting whether a complex with gp120, either alone or on the surface of an HIV-infected
cell, has formed.

For example, a biological sample may be treated with nitrocellulose, or another solid support which is
capable of immobilizing cells, cell particles or soluble protein. The support may then be washed with
suitable buffers followed by treatment with the fusion protein, which may be detectably labeled. The solid
phase support may then be washed with the buffer a second time to remove unbound fusion protein, and
the label on the fusion protein detected.

In carrying out the assay of the present invention on a sample containing gp120, the process
comprises:

a) contacting a sample suspected of containing gp120 with a solid support to effect immobilization of
gp120, or of a cell which expresses gp120 on its surface;

b) contacting said solid support with the detectably labeled immunoglobulin-like molecule or fusion
protein of the invention;

c) incubating said detectably labeled immunoglobulin-like molecule with said support for a sufficient
amount of time to allow the immunoglobulin-like molecule or fusion protein to bind to the immobilized
gp120 or cell which expresses gp120 on its surface;

d) separating the solid phase support from the incubation mixture obtained in step c); and

e) detecting the bound immunoglobulin-like molecule or fusion protein and thereby detecting and
quantifying gp120.

Alternatively, labeled immunoglobulin-like molecule (or fusion protein)-gp120 complex in a sample may
be separated from a reaction mixture by contacting the complex with an immobilized antibody or protein
which is specific for an immunoglobulin, e.g., protein A, protein G, anti-IgM or anti-IgG antibodies. Such
anti-immunoglobulin antibodies may be monoclonal or polyclonal. The solid support may then be washed
with suitable buffers to give an immobilized gp120-labeled immunoglobulin-like molecule-antibody complex.
The label on the fusion protein may then be detected to give a measure of endogenous gp120 and,
thereby, the presence of HIV.

This aspect of the invention relates to a method for detecting HIV or SIV viral infection in a sample
comprising

(a) contacting a sample suspected of containing gp120 with a fusion protein or immunoglobulin-like
molecule comprising CD4, or fragment thereof which binds to gp120, and the Fc portion of an
immunoglobulin chain, and

(b) detecting whether a complex is formed.

The invention also relates to a method of detecting gp120 in a sample, further comprising

(c) contacting the mixture obtained in step (a) with an Fc binding molecule, such as an antibody,
protein A, or protein G, which is immobilized on a solid phase support and is specific for the hybrid fusion
protein, to give a gp120 fusion protein-immobilized antibody complex,

(d) washing the solid phase support obtained in step (c) to remove unbound fusion protein, and

(e) detecting the label on the hybrid fusion protein.

Of course, the specific concentrations of detectably labeled immunoglobulin-like molecule (or fusion
protein) and gp120, the temperature and time of incubation, as well as other assay conditions may be
varied, depending on various factors including the concentration of gp120 in the sample, the nature of the
sample, and the like. Those skilled in the art will be able to determine operative and optimal assay
conditions for each determination by employing routine experimentation.

Other such steps as washing, stirring, shaking, filtering and the like may be added to the assays as is 
customary or necessary for the particular situation. 

One of the ways in which the immunoglobulin-like molecule or fusion protein of the present invention
can be detectably labeled is by linking the same to an enzyme. This enzyme, in turn, when later exposed to
its substrate, will react with the substrate in such a manner as to produce a chemical moiety which can be
detected as, for example, by spectrophotometric, fluorometric or visual means. Enzymes which can be
used to detectably label the immunoglobulin-like molecule or fusion protein of the present invention include,
but are not limited to, malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast
alcohol dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish
peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease,
catalase, glucose-6-phosphate dehydrogenase, glucoamylase and acetylcholine esterase.

The immunoglobulin-like molecule or fusion protein of the present invention may also be labeled with a
radioactive isotope which can be determined by such means as the use of a gamma counter or a
scintillation counter or by autoradiography. Isotopes which are particularly useful for the purpose of the
present invention are: 3H, 125I, 131I, 35S, 14C, 51Cr, 36Cl, 57Co, 58Co, 59Fe and 75Se.

It is also possible to label the immunoglobulin-like molecule or fusion protein with a fluorescent
compound. When the fluorescently labeled immunoglobulin-like molecule is exposed to light of the proper
wavelength, its presence can then be detected due to the fluorescence of the dye. Among the most
commonly used fluorescent labelling compounds are fluorescein isothiocyanate, rhodamine, phycoerythrin,
phycocyanin, allophycocyanin, o-phthaldehyde and fluorescamine.

The immunoglobulin-like molecule or fusion protein of the invention can also be detectably labeled
using fluorescence emitting metals such as 152Eu, or others of the lanthanide series. These metals can be
attached to the immunoglobulin-like molecule or fusion protein using such metal chelating groups as
diethylenetriaminepentaacetic acid (DTPA) or ethylenediaminetetraacetic acid (EDTA).

The immunoglobulin-like molecule or fusion protein of the present invention also can be detectably
labeled by coupling it to a chemiluminescent compound. The presence of the chemiluminescent-tagged
immunoglobulin-like molecule or fusion protein is then determined by detecting the presence of
luminescence that arises during the course of a chemical reaction. Examples of particularly useful
chemiluminescent labeling compounds are luminol, isoluminol, theromatic acridinium ester, imidazole,
acridinium salt and oxalate ester.

Likewise, a bioluminescent compound may be used to label the immunoglobulin-like molecule or fusion
protein of the present invention. Bioluminescence is a type of chemiluminescence found in biological
systems in which a catalytic protein increases the efficiency of the chemiluminescent reaction. The
presence of a bioluminescent protein is determined by detecting the presence of luminescence. Important
bioluminescent compounds for purposes of labeling are luciferin, luciferase and aequorin.

Detection of the immunoglobulin-like molecule or fusion protein may be accomplished by a scintillation
counter, for example, if the detectable label is a radioactive gamma emitter, or by a fluorometer, for
example, if the label is a fluorescent material. In the case of an enzyme label, the detection can be
accomplished by colorimetric methods which employ a substrate for the enzyme. Detection may also be
accomplished by visual comparison of the extent of enzymatic reaction of a substrate in comparison with
similarly prepared standards.

The assay of the present invention is ideally suited for the preparation of a kit. Such a kit may comprise
a carrier means being compartmentalized to receive in close confinement therewith one or more container
means such as vials, tubes and the like, each of said container means comprising the separate elements of
the immunoassay. For example, there may be a container means containing a solid phase support, and
further container means containing the detectably labeled immunoglobulin-like molecule or fusion protein in
solution. Further container means may contain standard solutions comprising serial dilutions of analytes
such as gp120 or fragments thereof to be detected. The standard solutions of these analytes may be used
to prepare a standard curve with the concentration of gp120 plotted on the abscissa and the detection
signal on the ordinate. The results obtained from a sample containing gp120 may be interpolated from such
a plot to give the concentration of gp120.
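
The standard-curve step described above can be sketched as follows. This is a minimal illustration assuming piecewise-linear interpolation between adjacent standards; the specification does not state an interpolation method, and the concentrations, signal values, units and function names below are hypothetical.

```python
# Sketch of reading a sample concentration off a standard curve built from
# serial dilutions. All numbers are invented for illustration.

def serial_dilutions(top_concentration, factor, n):
    """Concentrations of n standards, each diluted `factor`-fold from the last."""
    return [top_concentration / factor ** i for i in range(n)]


def interpolate_concentration(signal, standards):
    """Linearly interpolate a concentration from (concentration, signal) pairs.

    Concentration is the abscissa and signal the ordinate, as in the text.
    Raises ValueError when the signal lies outside the measured standards.
    """
    points = sorted(standards)  # order by concentration
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        low, high = min(s0, s1), max(s0, s1)
        if s0 != s1 and low <= signal <= high:
            return c0 + (signal - s0) * (c1 - c0) / (s1 - s0)
    raise ValueError("signal outside the range of the standard curve")


# Hypothetical two-fold dilution series (arbitrary units) and readings.
concentrations = serial_dilutions(8.0, 2.0, 4)   # 8, 4, 2, 1
signals = [0.85, 0.45, 0.25, 0.15]
standards = list(zip(concentrations, signals))

sample_concentration = interpolate_concentration(0.35, standards)
print("estimated concentration:", sample_concentration)
```

A signal outside the range spanned by the standards is rejected rather than extrapolated, since the curve is only characterized between the lowest and highest standard.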

The immunoglobulin-like molecule or fusion protein of the present invention can also be used as a stain
for tissue sections. For example, a labeled immunoglobulin-like molecule comprising CD4 or fragment
thereof which binds to gp120 may be contacted with a tissue section, e.g., a brain biopsy specimen. This
section may then be washed and the label detected.

The following examples are illustrative, but not limiting, of the method and composition of the present
invention. Other suitable modifications and adaptations which are obvious to those skilled in the art are within
the spirit and scope of this invention.



EXAMPLES

Example 1: Preparation of CD4-Ig cDNA Constructs

The extracellular portion of the CD4 molecule (see Maddon, P.J., et al., Cell 42:93-104 (1985)) was
fused at three locations in a human IgG1 heavy chain constant region gene by means of a synthetic splice
donor linker molecule. To exploit the splice donor linker, a BamHI linker having the sequence
CGCGGATCCGCG was first inserted at amino acid residue 395 of the CD4 precursor sequence (nucleotide
residue 1295). A synthetic splice donor sequence

    GATCCCGAGGGTGAGTACTA
        GGCTCCCACTCATGATTCGA

bounded by BamHI and HindIII complementary ends was created and fused to the HindIII site in the intron
preceding the CH1 domain, to the EspI site in the intron preceding the hinge domain, and to the BanI site
preceding the CH2 domain of the IgG1 genomic sequence. Assembly of the chimeric genes by ligation at
the BamHI site afforded molecules in which either the variable (V) region, the V + CH1 regions, or the V,
CH1 and hinge regions were replaced by CD4. In the last case, the chimeric molecule is expected to form a
monomer structure, while in the former cases, a dimeric molecule is expected.
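
The statement that the splice donor linker is "bounded by BamHI and HindIII complementary ends" can be checked directly from the two printed strands. The sketch below assumes the upper strand is written 5' to 3' and the lower strand 3' to 5', offset by a four-base overhang at each end; that orientation is an inference from the sequences, not something the text states, and all variable names are illustrative.

```python
# Consistency check on the printed linker sequences (orientation assumed:
# upper strand 5'->3', lower strand 3'->5', 4-nt overhang at each end).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base complement, without reversing."""
    return "".join(COMPLEMENT[base] for base in seq)


def reverse_complement(seq):
    return complement(seq)[::-1]


upper = "GATCCCGAGGGTGAGTACTA"        # assumed 5'->3'
lower_3to5 = "GGCTCCCACTCATGATTCGA"   # as printed, assumed 3'->5'
OVERHANG = 4

# The duplex region is the upper strand minus its 5' overhang, paired with
# the lower strand minus the overhang at its 5' end (printed on the right).
duplex_pairs = complement(upper[OVERHANG:]) == lower_3to5[:-OVERHANG]

# 5' overhangs, each read 5'->3'.
left_overhang = upper[:OVERHANG]                # compatible with BamHI (G^GATCC)
right_overhang = lower_3to5[-OVERHANG:][::-1]   # compatible with HindIII (A^AGCTT)

# GT(A/G)AGT is the usual consensus at the start of an intron.
has_splice_donor = "GTGAGT" in upper

# The BamHI linker quoted in the text should be palindromic and carry GGATCC.
bamhi_linker = "CGCGGATCCGCG"
linker_is_palindrome = reverse_complement(bamhi_linker) == bamhi_linker

print(duplex_pairs, left_overhang, right_overhang,
      has_splice_donor, linker_is_palindrome)
```

Under these assumptions the sixteen internal bases pair exactly, the single-stranded ends read GATC and AGCT, and the upper strand carries a GTGAGT motif, which agrees with the description of the linker as a splice donor with BamHI- and HindIII-compatible ends.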

One such genetic construct, which contains the DNA sequence which encodes CD4 linked to human IgG1
at the HindIII site upstream of the CH1 region (fusion protein CD4Hγ1), is depicted in Table 1. The plasmid
containing this genetic construct (pCD4Hγ1) has been deposited in E. coli (MC1061/P3) at the American
Type Culture Collection (ATCC) under the terms of the Budapest Treaty and given accession number
67611.

A second genetic construct, which contains the DNA sequence which encodes CD4 linked to human
IgG1 at the Esp site upstream of the hinge region (fusion protein CD4Eγ1), is depicted in Table 2. The
plasmid containing this genetic construct (pCD4Eγ1) has been deposited in E. coli (MC1061/P3) at the
ATCC under the terms of the Budapest Treaty and given accession number 67610.

A third genetic construct, which contains the DNA sequence which encodes CD4 linked to human IgM at
the MstII site upstream of the CH1 region (fusion protein CD4Mμ), is depicted in Table 3. The plasmid
containing this genetic construct (pCD4Mμ) has been deposited in E. coli (MC1061/P3) at the ATCC under
the terms of the Budapest Treaty and given accession number 67608.

A fourth genetic construct, which contains the DNA sequence which encodes CD4 linked to human IgM
at the Pst site upstream of the CH2 region (fusion protein CD4Pμ), is depicted in Table 4. The plasmid
containing this genetic construct (pCD4Pμ) has been deposited in E. coli (MC1061/P3) at the ATCC under
the terms of the Budapest Treaty and given accession number 67609.

A fifth genetic construct, which contains the DNA sequence which encodes CD4 linked to human IgG1 at
the Bam site downstream from the hinge region (fusion protein CD4Bγ1), is depicted in Table 5.

Two similar constructs were prepared from the human IgM heavy chain constant region by fusion with
the introns upstream of the μ CH1 and CH2 domains at an MstII site and a PstI site, respectively. The
fusions were made by joining the PstI site of the CD4/IgG1 construct fused at the Esp site in the IgG1 gene to
the MstII and Pst sites in the IgM gene. In the first instance, this was performed by treatment of the Pst end
with T4 DNA polymerase and the MstII end with E. coli DNA polymerase, followed by ligation; and in the
second instance, by ligation alone.

Immunoprecipitation of the fusion proteins with a panel of monoclonal antibodies directed against CD4
epitopes showed that all of the epitopes were preserved. A specific high affinity association is demonstrated
between the chimeric molecules and HIV envelope proteins expressed on the surface of cells transfected
with an attenuated (reverse transcriptase deleted) proviral construct.

Table 1



FN SB 

N S B M H DHA S 

UP B N G RAU T 

4 B V L A AE9 X 

H 2 1 1 1 236 1 

CCCTGTnGACAACCAGCGGCCAAGAAACACCCAACCCCACAGGCCCTGCCATTTCTCTC 
CGCACAAACTCTTCGTCGCCCGTTCm 

B PS S ^ 

DBS ADNPA D OHNA M HM HNC 

DAP VRLUU 0 RALU N AN PCR 

EN1 AAAM.9 E AEA9 L EL AIF 

122 22416 1 2346 1 31 211 

GGCTCAGGTCCCTACTGGCTCAGCCCCCTGCCTCCCTCCGCAAGGCCACAATCAACCGGC 
61 CCGAGTCCAGGGATGACCCAG^ 

M N R G - 

H F F 

V B N HH NMD 

1 B U HA U N D 

5 V * AE 4 L E 

1 1 H 12 H 1 1 

GAGTCCCTTTTAGGCACrrGCnCTGGTCCTCCAACTCGCGCTCCTCCCAGCAGCCACTC 

12 i ♦ ♦ 180 

CTCAGGGAAAATCCGTGAACGAAGACCACGACGTTGACCGCGAGGAGGGTCGTCGCTCAG 

VPFRHLLLVLQLALLPAATQ- 

B EE * J A 

8 C C A U 

V 0 0 A " 

1 K K * 1 

AGGGAAAGAAACTGGTGCTGGGCAAAAAACGCGATACAGTCGAACTGACCTGTACACCTT 

18 i * ♦ * ♦ ♦ * 240 

TCCCTTTCTT7CACCACCACCCGTTTTTTCCCCTATGTCACCTTGACTGCACATCTCGAA 

CKKVVLCKKGDTVELTCTAS-
. .. H

" » N 

B F 

0 0 ; 
2 2 



241 CGCrC^'cnaCCrArcnAiGCrcrc'c^CACCTTGCTaATTTCTAACACCCTT 

Q K K S I Q F H * K N S N Q 1 K I L G N - 



s S F H 

° VU u 3 DA F 



B r ii A A N H I 



AN1 K . A9 ,911 

422 1 26 l A 2 1 1 

ATCAGCGCTCCTTCTTAACTAAACGTCCATCCAAGCTCAATGATCCCGCTGAG ^ 

301 TACTCCCGAGGAACAA^GAT^ 

Q G S F L T K G P S K L N D R A D S R R - 

S S H H 

MANAS fA I A ID 

BVLITT CU N F f 

0AA9Y L3 j 2 11 

22461 1* 1 

GAAGCCmCGCACCAACGAAACTTCCCCCTGATCATCAAGAATCmACATAGAACACT ^ 

361 c^rGGlAirCCrGGnrcmGMCGG^CrAGTAGncnAGAATlCTATCTlCTGA 

slwdqcnfpli iknlki eds- 

M M AMAM J 

0 L AL9L ^ 

CACATACTTACATCTCTGA^GTCCAGGACCACAACGACCAGCTGCAATTGaACTCTTCG ^ 
421 crcrArGirTCrAGirAC^rACarcrGGTCnCCTCCTCCACGTTAACCATCACAACC 

D T Y I C E V E D Q K E E V Q L L V F C
B

S s 

P T 

U Y 

1 1 
CATTGACTGCCAACTCTCACACCCACCT'GCTTCACCCCCAGACCCTGACCCTCACCTTGG ^ 

481 CTA>CTCACCC^CAGACTCTGCa 

LTANSOTHLLQCQSLTITLE- 



B BS 



H 



BS SC D MIS 

AP TR D N NT 

Nl NF E L F Y 

22 H 1 111 

AGACCCCCCCTCCTAGTAGCCCCTCAGTGCAATGTACGAGTCCAAGGGGTAAAAACATAC 

541 ♦ * 600 

TCTCGGGGGGACCATCATCCCGGAGTCACGTTACA7CCTCAGGTTCCCCATTTTTGTATG 

SPPGSSPSVQCRSPRCKNIQ- 

N BBH S B BS 

M MO ASP A BSSGSC S B N SC 

B NO LPV I APTIAR T A L TR 

0 LE UBU U N1NACF X N A NF 

2 11 122 1 221111 1 1 * 11 

AGGGGGCGAAGACCCTCTCCGTGTCTCACCTCGAGCTCCAGGATAGTGGCACCTGGACAT 

... «« * - -* * 660 

JCCCCCCC^CTGGGAGAGGCACAGAGTCGACCTCGAGGTCCTATCACCGTGGACCTGTA 



CGKTLSVSQLELQDSGTWTC- 



N 
NS 
LP 
AH 
31 



M | NM A 

B HA L 

0 EE U 

2 11 1 



CCACTGTCTTGCACAACCACAACAACC7CCACTTCAAAATACACATCCTCCTGCTACCTT 
CGTGACAGAACGTCTTCGTCTTCTTCCACCTCAAGTTTTATCTCTAGCACCACCATCCAA 
TVLRNRKKVEFKIDIVVLAF-
HS MM

AT N N 

EU L L 

31 1 1 



TCCACAAGGCCTCCAGCATACTCTATAAGAAAGACCGGGAACACGTCCACTTCTCC ^ 

721 aggtcttccggaggtcctatca^ 



Q K A S S I V Y K 



K E C E Q V E F S F P 



» AM 

L L N 



781 



cactcgcctttacagttcaaaag^tgaccggcagtggcg^ 840 

F T V E K L T C S G E L W * Q A E R - 



L A 

P S 



H V FM A " 
P NLNU I 
H I" ML 3 J 

CGGCmCTCCTCCAAGK^ ^ 

841 CCCGAAGGAGGAGGTTCAGAACCT^ 

A S S S K S W I T F D L K N K E V S V K - 

B BS PS • u 

SM SCADNPAD A * » 

TA TRVRIUJD L h , „ 
EE NFAAAM9E U " " 

23 11224161 1 f 
AACGGGTTACCCAGGA^CTAAGCTCCAGATGCGCAAGAAGCTCCCGaCCACaCAa 

901 T^GrcrAlTrGGrccrGCCmrcrGGrcrArc'cGTTCTicGAGCGCGACCTGGACTCGG 

RVTQOPKLQMCKKtPLHLTL- 



EP 0 325 262 A2 



BSS- 

N TRAT 0 N P ™AN 

L NFEU E L J 7™ 

1 11 31 1 11 1 } 631 / 

TCCCCcicGCCTTGCaCACTATCCTGCCTCTCCAAACCTCAaaQGCCCTTC^GCGA ^ 

961 ACCGGGTCCCGAACGCAGTCATA^ 

PQALPHYACSCNLTLALEAK- 

F SC H 0 A 

A TR PD L 

N NF HE U 

; ii hi 

AAACAGGAAAGTTCCATCAGCAAGTGAACCTGCTGGTCATGAGAGCCACK ^ 

1021 '^IVd^l'^l^ 

T C K L H Q E V N L V V U R A T Q L Q K - 

PS s 

u ADNNPA DF AM DE A 

1 VRLLUU DA LN OS L 

[ AAAAV9 EN UL EP U 

! 224416 11 U 11 1 

AAAATTTGACCTGTGAGGTGTGCGCACCCACCTCCCCTAAGCTGATGC^ ^ 

1081 ^aUctgcacactccacacccct 

NLTCEVWGPTSPKLMLSLKL- 

" Q A * L ET 

J ? 2 1 12 

TCCAGAACAACCAGGCAAAGGKTCGAAGCGGGAGAAGCCGGTCTGGGTGCTGAACCCTG ^ 

1141 iiScncnccTccOTCCAGAGcSccccCT 

E N K E A K V S K R E K P V * V L N P E -
H PS H

F 0 M I A ADPA I 

0 0 A N V VRUU N 
K E E F A AAW9 F 

1 13 11 2216 1 

ACCCGCCGATGTGGCAGTCTCTCCTCAGTGACTCCCCACAGGTCCTCCTGCAATCCAACA 

1260 

1201 TCCCCCCCTACACCCTCACACACCACTW 

ACMWQCLLSDSCQVLLESNI- 

S SA BHF BS H 

ANA HNCP SCNMAANXA RSD I A 

VLU PCRA PIUNMULHV SCO N I 

AA9 AIFL 1ADLH3A0A AAE D U 

236 2111 21211A421 111 3 1 

TCAAGGTTCTGCCCACATGCTCCACCCCGGTCCACGCGGATCCCGACGCTCAGTACTAAG 

1261 icnCCAAGACGGCTGTACCAGGK 

KVLPTiSTPVHADPE 

BS B 

H H SC HS S M JJ D S 

? A TR AT T N NOP 

H ENFEUY L L E M 

1 3 11 31 1 1 111 

CTTTCTGCGGCAGGCCAGGCCTGACCTTGGCmGGGGCACCGAGGCGGCTAAGGTGAGC ^ 

1321 CAAAGACCCCGTCCCCTCCGGACK 

B A BH B P 

BASHBHHNN P SG N ^ f J 

AHPHBAPAL A PI L JT L J 

NAMAEEHRA L U A Nl M A 

121112114 1 21 3 22 1 i 

CACGTCGCCCCAGCAGGTCCACACCCAATGCCCATCAGCCCAGACACTGGACCCT^ ^ 

1381 CTCCACCCCCCKCTCCACCTCTCCCT1K 

c BS S B SS B S FN 

N M SC DNHA H SMAAHNABSAC NS 

UN TR RLAU H TNUUAlPAPlR UP 

D L NF AAE9 A NL99EAAN1UF OB 

2 1 11 2436 1 11663412211 22 

TCCCCGACACnAACAACCCAGCCCCCTCTCCCCCTCGCCCCAGCTCU ^ 
1441 ACCGCCTCTCA>TTCTTGGGTCCC^
F


BSS 


BS 


NM 


S BMDMHNABSAA 


SCB 


UN 


T BBRNALPAPUU 


TRA 


41 


Y V0ALEAAN199 


NFM 


HI 


1 12213412266 


111 




/ / //// 


/ 





BH 


B NFS 


BS 


F 


BS 


M 


MSG 


MSB SNAH 


SC 


N 


SC 


N 


NPI 


NPB PUUA 


TR 


U 


TR 


L 


L1A 


L1V B49E 


NF 


4 


NF 


1 


121 


. 121 2H63 


11 


H 


11 




/ 




/ 




/ 



MS BNN 
AA ALL 
EC NAA 
32 134 

GCTCACATGGCACCACCTCTCTTGCAGCCTCCACCAAGGGCCCATCGCTCTTCCCCCTGG 

1501 I 560 

CCAGTGTACCCTCCTCGACAGAACCTCCGACCTGGTTCCCCCGTACCCACAAGCCGCACC 

ASTKGPSVFPLA- 



N 
L 
A 
4 

/ / / 

CACCCTCCTCCAAGACCACCTCTCGGCGCACAGCGCCCCTCCGCTGCCTCC7CAACGACT 

1561 * 1620 

GTCCCACCACGTTCTCCTCCAGACCCCCGTCTCCCCGCGACCCGACGCACCACTTCCTCA 

PSSKSTSCCTAALCCLVKDY- 

H M T H D 

PA T P D 

A E H H E 

2 3 1 1 1 

ACTTCCCCGAACCCGTGACGCTCTCGTCGAACTCACCCCCCCTGACCAGCGCCGTCCACA 

1621 ♦ 1680 

TCAAGGGGCnCGCCACTGCCACAGCACCTTCAGTCCCCGGCACTCGTCGCCCCACGTGT 

FPEPVTVSWNSCALTSGVHT- 

S H F B 

HNC DM I M ON M SM B 

PCR DS N N 0 U N TA B 

AIF ET F L E 4 t L EE V 

211 12 1 1 1 H 1 23 1 

// / I 

CCnCCCGGCTGTCCTACAGTCCTCAGCACTCTAC-rCCCTCAGCAGCCTGCTCACCGTCC 

16gl ♦ ♦ . — ♦ 1 ?4C 

GCAACGCCCGACACCATGTCAGGAGTCC7GAGA7CAGGGACTCCTCGCACCACTGGCACG 

FPAVLQSSGLYSLSSVVTVP- 

B F B B H 

SH N ASM B NSB MI 
PP U LTN A LPB A N 

1H 4 UXL N A1V E F 

21 H 111 1 421 2 1 

CCTCCAGCAGCTTGCGCACCCAGACCTACATCTGCAACGTGAATCACAAGCCCACCAACA 

17 41 1800 

CCACCTCCTCGAACCCGTGCGTCTGCATGTAGACGTTGCACTTACTGTTCGCCTCCTTCT 

SSSLCTQTYICNVNHKPSNT- 





NF 


A 


BH 


BANHBHN 


SN 


P 


SC 


AHAHBAL 


PU 


A 


PI 


NARAEEA 


B4 


L 


1A 


1211124 


2H 


1 


21 


/ //
M HM HM

5 N ^ PN 

J L EL HL 

1801 ^rcircrGn'cmrAArcrcrcrcrGCrcScTCCCTCCCTCCCACACACGACCTT 

K V D K K V 

F BS SS F BS F 

*r THH F SC HHNCF N BSC N 

DE CHH t- oj. orrak U BTR U 

! S i s § i s : 

oc»ocac»ccocTcaocc{ocAcoc«TccccccT»Tcc»ccccc*CTcc«cccAGa i9jo 

1661 rcrcrjcrcrCMarcWac'cSro'o'c'cCCATACCTCCCCCTCACGTCCCCTCOT 

c S 

OBHMHNA HMNCM 

RBABPLU PNCRL 

AVE0HA9 21114 1 1312 

2132146 21114 

*CGCAGG&(GTCTGCCTCnCACCCGGAG^ 19£C 

1921 kcg^cc"c"g*c7a^ 

dc p B BS 

SC F M B N S SC 

to i A A L P TR 

NF t E NA 1 NF 

11 1 114 2 11 

GAGCGTCnCTGCCTTTTrCCclGGCTCTCGGCACCCACA?GCTAGGTGCCCa^ ^ 
1961 CTCCCAGAAGACCGAAAAACGGT^ 

B B S 

S f DBS S M HNC A 

DHA S Jg P N PCR V 

EX V EN1 M L A1F A 

g ! 122 1 1 "I 2 

CCCCCTCCACACAAACCCGCACCTGCTGCCCTCACACCTCCCAACACCCATATCCCCCAC ^ 

2041 rwccirGrcrcmrcrccrcrArcircrcrGrcTGwccGTTCTcccTATAGGcccTc 






DNPA 0 H D A V 

RLIU D A Shi 

AAM9 E . E f V V 

2416 1 3 111 

GACCCTCCCCCTGACCTAAGCCCACCCCAAAGCCCAAACTCTCCACTCCCTCAGCTCGGA 

2ioi — — ♦ ♦ ♦ * * 2150 

CTGGGACGGGCACTGGATTCGGGTGGGGTTTCCGGTTTCAGAGGTGAGGGAGTCCAGCCT 

H B 

I M MM P BS 

N N AB S AP 

F L EO T Nl 

1 1 32 1 22 

CACCrrCTCTCCTCCCACATTCCAGTAACTCCCAATCTTCTCTCTCCAGAGCCCAAATCT 
GTCCAAGAGAGGAGGCTCTAAGGTCATTC 

E P K S - 

N BBS BS 

M NS SSC SC HS M 

A LP PTR TR AT N 

F AH INF NF EU L 

3 31 211 U 31 1 

II II 
TGTCACAAAACTCACACATGCCCACCGTCCCCAGGTAAGCCACCCCACGCCTCCCCCTCC 



2161 



2221 



2280 



2340 



ACACTCTTTTGAGTCTCTACCGCTCCCACCGGTCCATTCGCTCGGGTCCCGAGCGGGAGG 
CDKTHTCPPCP 

B BS S S S 

AM B N SM F SC F DHNA HNC 

,N A L PA 0 TR A RALU PCR 

i I N A IE K NF N f AEA9 AIF 

1 \ 1 A 21 1 11 1 2346 211 

1 III 
ACCTCAACGCGGGACAGGTCCCCTAGAGTACCCTCCATCCACGCACACCCCCCACCCCGG 

2281 TCCAGTTCCCCCCTCTCCACCGGM 

BS S 

. u u M D M SC V ANA M 

{ T B NO N TR B VLU B 

LEO L E L NF 0 AA9 0 

3 2 2 1 1 1 « 2 246 ; 2 

TGCTGACACGTCCACCTCCATCTCTTCCTCAGCACCTCAACTCCTCCGCCGACW ^ 

2341 ACGACTGTGCAGGTGGAGGTA^ 

APELLCCPSV-
S ss

y S AN M HW.ANNAC DM M 

N T UL N PNVCLUR DS A 

L Y 3A L ALA1A9F ET E 

ll' A3 1 2121461 12 3 

/ / // / 

CnCCTCTTCCCCCCAAAACCCAACCACACCCTCATCATCTCCCCGACCCCTCACCTCAC 
2401 CAACCACAACCGGCGT^GCCnCCTCTGGGACTACTAGACGCCCTCCGCACTCCACTG 

FLFPPKPKDTLMISRTPEVT- 

Jt U y DM M RM M 

LP A H OS B SA N 

AH E L ET 0 AE L 

Tl 2 1 12 2 12 1 

A7GCCTGCTGGTCGACCTGAGCCACCAACACCCTCACGTCAAGTTCAACTCCTACCTGCA 
2461 7ACGCACCACCACCTGCACTCGGTCCTT 



c V V v 



DVSHEOPEVKFNW YVD 



F FN 

M N NSS R MR 

N U UFA S AS 

L 4 DBC A E A 

1 H 222 1 2 1 

CGGCGTCGAGGTCCATAATCCCAAGACAAACCCCCCCGAGGAGCACTACAACACCACGTA 
2521 cCCGCACCKCArGTATTACCGTlCTGTTTCCGCGCCncnCGKATGTTGTCCTCCAT 

G V E V H N A K T K P R E E Q Y N S T Y - 

c BS 
HNC HH M SC | * 

PCR CP N ™ ? 

AIF AH L NF 

211 11 1 » 1 

CCCGGTCCTCAGCCTCCTCACCCTCCTGCACCACGACTGCCTGAATCGCAACCAGT^ ^ 

2581 GGCCCACCAGTCGCAGGAGTW 

RVVSVLTVLHQDWLNGKEYK- 

M T 
N A 
L Q 
1 1 

CTCCAAGCTCTCCAACAAACCCCTCCCAGCCCCCATCGAGAAAACCATCTCCAAAGCCAA ^ 

2641 CACCnCCAGAGCTlCmCCGCAW 

C K V S N K A L P A P I E K T I S K A K -
P S S s

ADNNPMA A H M N HHN BSAH 

VRLLUNU U A N L APA GFUA 

AAAAML9 9 E L A EAE L19E 

2244116 631' 3 321 1163 

ACGTGGGACCCGTGGGCTGCCAGGGCCACATGGACAGAGGCCCGCTCCCCCCACCCTCTG 

2701 ♦ ♦ * * 2760 

TCCACCCTCGCCACCCCACGCTCCCGGTGTACCTCTCTCCGGCCGACCCCCGTCCGAGAC 

N F 
D M M S R « J J I 

DN A P S N U V B 

ELE BA L4A V 

113 2 1 1 H 1 1 

CCCTGAGAGTGACCCCTGTACCAACCTCTCTCCTACAGGCCACCCCCGAGAACCACACCT 

2761 ♦ ♦ 2820 

CCGAC7CTCACTGGCCACATCGTTCGACACAGGATCTCCCCTCCGGGCTCTTGCTGTCCA 

CQPREPQV- 

SS BS BS 

R F AHNNCCS A F SC SC 

S 0 VPCCRRM L 0 TR TR 

A K AAI1FFA U K NF NF 

1 1 1211111 1 1 11 11 

GTACACCCTGCCCCCATCCCCCCATGACCTGACCAACAACCACGTCACCCTGACCTGCCT 

2fi21 ♦ * ♦ 2860 

CATGTGGCACGGGGGTAGGGCCCTACTCGACTGGTTCTTGGTCCAGTCGCACTGGACGGA 

YTLPPSRDELTKNQVSLTCL- 

B F 
| N H 

p UP 

H 2 

GGTCAAAGGCnCTATCCCAGCCACATCGCCGTCGAGTCGGAGACCAATCGGCAGCCGGA 

2851 icAGmCCGAAGATAGGGTCCCTCT 

VKGFYPSDIAVEWESNCQPE- 



B 
B 
V 



H 

Ml M N H 

N N B L P 

L F 0 A H 

, 112 4 1 

GAACAACTACAAGACCACGCCTCCCGTGCTCCACTCCGACGCCTCCTTCTTCCTCTACAC ^ 

2941 cnGnCA7GnCTCCTGCGGACCCCACGAC 

NNYKTTPPVLDSDCSFFLYS-
B F S

MA S m MBX NF M 

N L P UB ABM LA N 

LU M 40 EVN AN L 

11 1 H2 211 31 1 

/ 

CAAGCTCACCCTGCACAAGACCACGTGGCAGCAGGCGAACGTCnCTCATCCTCCGTGAT 

3001 ♦ « * ♦ * ♦ 3060 

CTTCGAGTGGCACCTCrrCTCGTCCACCGTCGTCCCCTTGCAGAACAGTACGAGGCACTA 

KLTVDKSRWQQCNVFSCSVM- 

S 

N N M M HNC 

SL B N PCR 

I A 0 L AIF 

13 2 1 211 

/ 

CCATGACCCTCTCCACAACCACTACACGCAGAAGAGCCTCTCCCTGTCTCCGCCTAAATG 

3061 * 3120 

CGTACTCCGACACGTCncCTCATGTGCGTCTTCTCCGACACCGACAGACGCCCATTTAC 



HEALHNHTTQKSLSLSPGK 



CXHHN 
FMAPA 
RAEAE 
13321 
/ 

ACTGCCACGCCCC 

3121 3133 

TCACGCTGCCCGC

Table 2

FN S B

N S B M H DHA S 

UP B • N G RAU T 

4 6 V L A AE9 X 

H 2 1 1 1 236 1 

/ 

GCCTCTTTGAGAAGCAGCCCGCAAGAAAGACGCAAGCCCAGACCCCCTCCCATTTCTGTG 

j ♦ * ♦ 60 

CCCACAAACTCTTCCTCCCCCCncmCTGCGTTCCGCTCTCCCGCACCGTAAACACAC 

B PS S S 

DBS ADNPA 0 DHNA M HM HNC 

DAP VRLUU D RALU N AN PCR 

EN1 AAAM9 E AEA9 L EL AIF 

122 22416 1 2346 1 31 211 

/ / // / / / 

GGCTCAGGTCCCTACTGCCTCACGCCCCTGCCTCCCTCCCCAAGCCCACAATCAACCCCC 

61 - — 120 

cccactccacgcatgacccagtccgggcacccagggagccgttccggtgttacttggccc 

V N R G - 

H F F 

I 6 N HH NMD 

N 6 U HA UNO 

F V 4 AE 4 L E 

1 1 H 12 H 1 1 

GAGTCCCTTTTAGGCACnGCrrCTCCTGCTCCAACTGGCCCTCCTCCCACCAGCCACTC 

]2i « * » • ♦ * 180 

CTCACCGAAAATCCGTCAACCAACACCACCACCrrGACCCCCACCACCGTCCTCCCTGAG 

VPFRHLLLVLQLALLPAATQ- 

B E E R A 

B C C SI 
V 0 0 * A U 

IKK 11 
AGGGAAACAAAGTGCTCCTGGGCAAAAAAGGGGATACACTCGAACTGACCTGTACACCTT 

181 « 24C 

TCCCTTTCTTTCACCACGACCCGI 1111 I CCCCTATGTCACCTTGACTGGACATGTCGAA 

GKKVVLGKKGDTVELTCTAS- 

H 

V M I 
B B N 
0 0 F 

2 2 1 

CCCAGAA^.ACACCATACAATTCCACTGCAAAAACTCCAACCACATAAACAT-TCTGCGAA 

2C\ 300 

GGG'CTlcnCTCGTATGTTAAGGTGACCTTTTTCAGCTTCCTCTATTTCTAACACCCTT 

Q KKSI QFHWKNSNQIKILCN-
B S S F H

NES F AA A A N H I 

LAP 0 VU L U U H N 

422 1 ™ 1 A 2 1 1 



ATCAGGCCTCCT1CTTAACTAAAGGTCCATCCAACCTGAATGATCCCCCTGACK ^ 

TAGTCCCGAGGAAGAA^GAT^CCAGG^ 

QGSFLTKGPSKLN0RAD5RR- 



c S H H 

MANAS BA I A 10 

EVLUT CU N f 

CAA9Y L3 F L 

22461 1A 12 11 



GAACCCTTTGGGACCAACCAAACTTCCCCCTGATCATCAAGAATCTTAAGAT^ ^ 

mCGCAAACCCTGCnCCmGAAGCGW 

SLWOQGNFPL I I KNLKl 

5 M 
M M AMAM » 

B N VNUN J 

0 L AL9L J 

2 1 ?1 61 

CAGATACTTACATCTGTCAAGIGGAGGACCAGAAGGACGAGGTGCAATTGCTAG^ ^ 

^oa^ta^^ 

DTY I CEVEDQKEEVQLLVi-u 



D 



! 5 

1 1 1 

CAT1GAOCCCAACTCTGACACCCACCTGCTTCAGGGGCAGAGCCTGACCCTGK ^ 

' 81 CTAACTGACCGnCACACIGTCGGTGW 

LTANSDTHLLQGQSLTLTLE- 

B PS H 

ES SC D MIS 

AP TR D N NT 

Nl NF E I FY 

22 11 1 111 

AGAGCCCCCCTGCTAGTAGCCCCTCAGTCCAATGTAGCAGTCCAAGGCGTAAAAACATAC 

541 600 

TCTCGGGGCGACCATCATCGGCGACTCACGTTACATCCTCAGGTTCCCCATTTTrGTATG 

SPPCSSPSVQCRSPRCKNIQ-
N BBH S B BS

M UD ASP A BSSGSC S B N SC 

B ND LPV L APTIAR T A L TR 

0 LE UBU U N1NACF X N A NF 

2 11 122 1 221111 1 1 4 11 

ACCGCCCCAAGACCCTCTCCOTGTCTCAGCTCCACCTCCACCATACTCCCACCTCGACAT 

601 ♦ ♦ 

TCCCCCCCnCTCCGAGAGGCACAGACTCCACCTCGAGCTCCTATCACCGTGGACCTGTA 

GGKTLSVSQLELQDSCT*TC- 
NS « NM A 



aS 0 ^ U 

« 2 11 1 



gcactctcttgcacaaccacaacaacgtccacttcaaaatacacatcgtggtcctagctt ^ 

661 CCTGACAGAACGTC^GGTCrimCC^ 

TVLQNQKKVEFKIOIVVLAF- 

HS MM 
AT N N 

EU L L 

31 11 

TCCACAAGGCCTCCAGCATAGKTATAAGAAACAGCGCGAACAGGTCGACnCTCCTTCC ^ 

721 * * 

AGGTCTTCCGGAGGTCGTATCAGATArrcrnCTCCCCCnGTCCACCTCAAGAGCAAGC 

QKASSIVYKKECEQVEFSFP- 

A AM 
L L N 

U U | L 

1 1*1 

CACTCGCCTTrACAGTTGUAAGCTCACGGCCAGTGGCCAGCTCTGGTGGCAGGCGGAGA 

7£< * 8 < C 

GTGAG^GCAAAlGTCAACTTnCGACTGCCCGTCACCCCTCGACACCACCGTCCGCCTCT 



L A 



FTVEKLTGSGELWWQAER 



P S 

H M FW A M 
P N LN U B 
H L ML 3 0 
1 1 11 A 2 

GGGCTTCCTCCTCCAAGTCnGGATCACCTTTCACCTGAACAACAACCAAGTCTCTGTAA 

84i — - ♦ 900 

CCCGAACCAGGAGGTTCAGAACCTAGTCGAAACTGGACTTCTTCnCCTTCACAGACATT 
ASSSKSWITFOLKNKEVSVK-
B BS PS

SM SCADNPAD A * " 

TA TRVRLUJO L LP 

EE NFAAAV9E U u M 

23 11224161 1 11 

AACGCGnACCCAGCACCCTAACCTCCAGATGCGCAACAAGCTCCCGCK ^ 

901 ticcccaItSctcctc^ 

RVTdOPKLQWCKKLPLHLTL- 

BS BSS 
M SC HS D M H SCAHM 

N TR AT D N P TRUAN 

L NF EU E L H NF9EL 

1 11 31 1 11 » 63 J 

TCCCCCAGCCCTTCCCTCAGTATCCTGGC7CTCGAAACCTCACCCTCCCCCT1CAACCCA 

961 icCGCGTCCCGAACCGAGTCM 

PQALPQYAGSCNLTLALEAK- 

S BS 

F SC H D A 

A TR P D L 

N NF HE U 

1 11 111 

AAACACCAAAGTTCCAKAGCAACTGAACCTGCTGC7GATGACACCCACTCAGCTCCACA 

- n «- ^ _ „, - «• * * 1C80 

mGTCCrnCAACGTAGTCCTTCACrrCGACCACCACTACTCKGGTGAGTCCACGTCT 
TGK LHQEVNLVVyRATQLQK- 

PS s 

ADNNPA DF AM DE A 

VR' LUU DA IN OS L 

AAAAM9 EN Uf EP U 

224416 11 11 n 1 

AAAATTTGACCTG'GAGGTGT GCGGACCCACCTCCCCTAAGCTCATGCTCAG^ ^ 

1081 TT^AAACTGGACACTCCACAC^ 

NL TCEVWCPTSPKLMLSLKL- 

T h m oy 

A P N OS 

, Q A L ET 

J 1 2 1 12 

TGGACAACAAGGAGCCAAAGGTCKGAACCGGGAGAACCCGG1CTCCGTCCTGAK ^ 

1141 ACnCTTCTTCCKCGTTTCcIc^ 

ENKEAKVSKRE KPVWVLNPE- 



V 

h 

L 
1 



M 

N 
H PS H

F D W I A ADPA I 

0 D A N V VRUU N 
K E E F A AAM9 F 

1 13 11 2216 1 

/// 

AGCCGCGGATGTCGCAGTGTCTGCTGAGTGACTCGCGACAGGTCCTGC7GGAATCCAACA 

1201 ♦ * ♦ ♦ 1260 

TCCCCCCCTACACCGTCACAGACGACTCACTCAGCCCTGTCCAGGACGACCTTAGCTTCT 

A G M W Q C 



L L 


S 0 


S G Q V L 


L E S N I 


S 


SA 


BHF BS 


H 


ANA 


HNCP 


SGNWAANXA 


RSD I A 


VLU 


PCRA 


PIUNYULHV 


SCO N L 


AA9 


AIFL 


1ADLH3A0A 


AAE D U 


236 


2111 


21211A421 


111 3 1 


// 


// 


/ / / / 


/ 



TCAAGGTTCTGCCCACATCGTCCACCCCGCTGCACCCGCATCCCCAGGGTGAGTACTAAG 

1261 * < ♦ * ♦ 1320 

AGTTCCAAGACGGGTGTACCAGGTGGGGCCACGTGCGCCTAGGGCTCCCACTCATGATTC 



K 


V L 


P 


T W S 


T P V 


H A D P 


E 






E 




BS 


ss 


F 


BS 


F 


H 


CHH 


F 


SC 


HHNCF 


• N 


ESC 


N 


P 


OHA 


0 


TR 


PGCRA 


U 


BTR 


U 


H 


4AE 


K 


NF 


AAIFN 


4 


VNF 


4 


1 


712 


1 


11 


21111 


H 


111 


H 


/ 






/ 


II 




// 





CTTCAGCGCTCCTGCCTGGACCCATCCCCGCTATGCAGCCCCAGTCCAGGGCAGCAAGGC 
1321 - 1350 

caagtcccgacga:gcacctgcgtagcgcccatacgtcccgctcacgtcccgtcgt7ccg 
S s 

DBHVHNt HVNCN M |WNDM 

RBA5PLU PNCR. N NLDB 

AVEDHA9 A.1FA L LAEO 

2132U6 21114 1 1312 

// // // 
ACGCCCCGTCTGCCTCTTCACCCGGAGCC7CTGCCCCCCCCACTCATGCTCACGGAGAGG 

1381 * 1440 

TCCCCGGCAGACGCACAAGTGGCCCTCGGACACCGGCGGGCTGAGTACCAGTCCCTCTCC 

BS P B BS S 

SC F M B N S SCDHA 

TR L A A L P TRRAU 

NF M E N A 1 NFAE9 

11 1 114 2 11236 

/ / / 

GTCnCTCGCTTTTTCCCAGGCTCTCGGCAGCCACAGGCTAGGTGCCCCTAACCCACGCC 

1441 ♦ ♦ 1500 

CACAAGACCGAAAAAGGC7CCCAGACCCGTCCGTGTCCGATCCACCGCGATTCGGTCCGG 
B B B S PS

s DBS S M HNC ACNPA 

p DAP P H PCR VRIUU 

M • EN1 ML A1F AAAV9 

! 122 11 211 22416 

/ / / // 
CTGCACACAAACGGCCAGGTCCTGGCCTCAGACCTCCCAAGACCCATATCCGCGAGCACC 

1501 1560 

CACGTGTGTTTCCCCGTCCACGACCCCAGTCTGGACGGTTCTCGGTATAGGCCCTCCTCG 

D H DAW 

D A D L N 

E E E U L 

1 3 111 

CTCCCCCTGACCTAAGCCCACCCCAAACGCCAAACTCTCCACTCCCTCAGCTCCCACACC 

1561 • 1620 

CACCCGGACTGCATTCGCGTGCCCTnCCCGTTTGACAGGTCACGCAGTCCACCCTGTGG 

H B 

I M MM P BS W 

N N AB S AP A 

F L EO T Nl E 

1 1 32 1 22 3 

/ / 
nnCTCCTCCCAGAnCCAGTAACTCCCAATCTJCIOCTGCACAGCCCAAATCrrGTG 

1621 ~ • — - — « • • • — ♦ " - - » • ♦ 

AAGAGAGGAGCCTCTAAGCTCAnGACGGTTACAACACAGACCTCTCGCGTTTAGAACAC 



He: 



174C 



E P K S C D 

N EES BS 

NS SSC SC HS MA 

L P FTP TR AT NL 

AH JN= NF EU L U 

2-.1 11 31 11 

"/ / 1/ / 

ACi it A:-C*CtCi"GC(Ct::i"GCCCACGTAAGCCAGCCCACGCCTCGCCCTCCAGCT 

16£ " TCrrTlCAGlGTG^ACGGGTCGCK 

KTHTCPPCP 

B BS S S S 

M B N SM F SC F DHNA HNC 

N A L PA 0 TR A RALU PCR 

L N A IE K NF N AEA9 AIF 

l 1 4 21 1 11 1 2346 211 

/ / / 

CAAGGCCGCACAGGTGCCCTACAGTACCCTGCATCCACCGACACCCCCCACCCGCGTGCT 

1741 1600 

CTTCCGCCC1GTCCACGGCATC1CATCCCACGTAGCTCCCTGTCCCCGGTCGCCCCACCA 
BS S

AM M M D M SC VANAM 

FAB N D N TR B VLU B 

LEO L E L NF 0 AA9 0 

3 2 2 11 1 11 2 246 2 

II 

GACACGTCCACCTCCATCTCT7CCTCACCACCTCAACTCCTGCGCCGACCCTCACTCTTC 

1801 ♦ * * 1860 

CTGTGCACGTGGAGGTAGAGAAGGAGTCGTGGACTTGAGGACCCCCCTCGCAGTCACAAG 

APELLGCPSVF - 

S T.S N 

MS AN V. HMANNAC DM M NS 

NT UL N PNVCLUR DS A LP 

L Y 3A L ALAIA9F ET E AH 

11 A3 1 2121461 12 3 31 

I I II I I 
CTCT7CCCCCCAAAACCCAAGGACACCCTCATGA7CTCCCGGACCCCTGACGTCACATGC 

1861 * 1920 

CAGAAGGGGGGTTT7GGGTTCCTGTGGGAGTACTACAGGGCCTCGCCACTCCAGTCTACG 

LFPPKPKDTLMISRTPEVTC - 

RM M 
SA N 
AE L 
12 1 

GTCCTCCTGGACGTGAGCCACGAAGACCCtCAGGTCAACTTCAACTGCTACGTGGACGCC 

1921 ♦ - * 1980 

CACCACCACCTGCACTCGGTGCT1CTGGGACTCCAGTTCAAGTTGACCA7CCACCTGCCG 

VVVDVSHEDFEVKFNWYVDG 

S 

R M R HNC 

5 s A S PCR 
A * E A A1F 
1 2 1 211 

/ 

iAGCAGTACAACACCACCTACCGG 

1981 ♦ 2040 

CACCTCCACGTATTACGGTlCTGTTTCGCCGCCCTCCTCGTCATGrTCTCGTCCATGGCC 

VEVHNAKTKPREEQYNSTYR - 



K 


P K D T 


L M 


I 


M 


M 


DM 


M 


A 


N 


OS 


B 


E 


L 


ET 


0 


2 


1 


12 


2 






/ 





D 


P 


E 




r 


PK 


V 


\ 


NS: 


N 






L 


t 


DEC 


1 


H 


222 






// 
BS




HH 


M 


SC 


R 


CP 


N 


TR 


S 


AH 


L 


NF 


A 


11 


1 - 


11 


1 






/ 





G7CCTCACCCTCCTCACCGTCC7CCACCAGCAC7CGCTCAA7CCCAACCACTACAAGTCC 

2041 ♦ - --♦ - — 2100 

CACCAGTCGCAGCAGTCCCAGGACCTGGTCCTCACCCACnACCGTTCCTCATGTrCACG 

VVSVLTVLHQDWLNGKEYKC 

M T 
N A 
L Q 
1 1 

AACCTCTCCAACAAACCCCTCCCACCCCCCATCCACAAAACCATCTCCAAACCCAAAGCT 

2101 ♦ « 2160 

77CCACAGG77G7T7CGCCAGCC7CGGCCG7ACC7C77T7GG7AGACG7T7CGG7T7CCA 



V S N K 


A L 


P A 


P I 


E K 


T I S K A K 




P S 


S 








5 




ADNNPMA 


A 


H M 


N 


HHN 


BSAH 


D 


VRLLUNU 


U 


A N 


L 


APA 


GFUA 


D 


AAAAML9 


9 


E L 


A 


EAE 


LI9E 


E 


2244116 


6 


3 1 


3 


321 


1163 


1 


//// / 










/ 





GGCACCCGTGCGGTGCCAGGGCCACATCGACACAGGCCCGCTCCCCCCACCCTCTGCCCT 

2161 2220 

CCCTGGGCACCCCACGCTCCCGG7GTACCTGTCTCCCGCCCAGCCCCCTGCGAGACCGCA 







N 




w 


V 


S 


R 


N 


A 


P 


S 


L 


E 


B 


A 


1 


3 


2 


1 





F 










M 


N 


A 


B 


R 


F 


N 


U 


V 


B 


S 


0 


L 


4 


A 


V 


A 


K 



. . 1 H 1 1 11 

GAGAGTGACCGCTGTACCAACClCTGTCCTACACGGCAGCClCGACAACCACAGCTGTAC 

2221 ♦ ♦ ♦ 

CTCTCACTGGCGACATGGTTGGACACAGGATGTCCCCTCGGCGCTCrrGCTGTCCACATG 

GQPREP QVY • 
SS BS BS B

AHNNCCS A F SC SC S 

VPCCP.Rf.' L 0 TR •« P 

AA11FFA U K NF ^ M 

1211111 1 . 1 11 " 1 

ACCCTGCCCCCATCCCGCGATGAGCTGACCAAGAACCAGCTCACCCTGACCTGCCTGGTC 
TCGGACCGCCGTAGCGCCCTAOCW 

TLPPSROELTKNQVSLTCLV - 

F 

N H B 
UP B 
4 A V 
H 2 1 

AAACGCnCTATCCCACCCACATCGCCCTGCACTGGCACAGCAATCCCCACCCCGACAAC 
TTTCCGAAGATAGGGTCGCTGTAGCGGCACCTCACCCTCTCCTTACCCGTCGCCCTCTTG 
KGFYPSOIAVEWESNGQPEN - 

H 

Ml M N H MA 

N N B L P N L 

L F 0 A H LU 

112 4 1 11 

AACTACAAGACCACGCCTCCCGTGCTGGACTCCGACGCCTCCTTCTTCCTCTACAGCAAG 

2401 ^CATScTGGTGCGCAGGGCACCACCTGAGGCTGCCCACCAAGAAGGAGATCTCCrrC 



NYKTTPPV 

B 
S 
P 
M 
1 



L D 


S D G S 


F F L 




F 




S 




NM 


MBX 


NF 


M 


UB 


ABM 


LA 


N 


40 


EVN 


f AN 


L 


H2 - 


211 


31 


1 



Y S K 



N 

S 
I 
1 



CTCACCGTGGACAAGAGCAGGTGGCAGCAGCGGAACGTCTTCTCATGCTCCGTGATGCAT 

?461 ♦ * * 2520 

GAGTCGCACCTCT1CTCCTCCACCC7CCTCCCCTTGCAGAAGAGTACGACCCACTACGTA 

LTVDKSRWQQGNVFSCSVMH - 
•5


M. 


U 


HNC 


B 


N 


PCR 


D 


L 


AlF 


2 


1 


211 



GAGCCTCTGCACAACCACTACACCCAG*.AGAGCaCTCCCTGTCTCCCCCTAAATCAGTG 
2521 crcCGirACGTCnCGrGAicTicGrcTinCGCACAGGGACAGACGCCCAmACTCAC 
EALHNHYTQKSLSLSPGK. 

CXH 
FMA 
RAE 
133 
/ 

CGACGGCCG 

2581 2589 

GCTCCCCCC 



Table 3

F K SB 
N S B W H DHA S 

UP B N G RAU T 

4 B VI A AE9 X 

H 2 1 1 1 236 1 

/ 

GCCTGTTTGACAACCAGCGCGCAAGAAAGACCCAACCCCACACGCCCTGCCATTTCTGTC 

1 ♦ * ♦ 60 

CGGACAAACTCrrCGTCCCCCGnCTTTCTGCCnCGGGTCTCCGCGACGGTAAAGACAC 

B PS S S 

DBS ADNPA 0 DHNA M HM HNC 

OAP VRLUU D RALU N AN PCR 

EN1 AAAV9 E AEA9 L EL A1F 

122 22416 1 2346 1 31 211 

/ / // / / / 

CGCTCAGCTCCCTACTGGCTCAGCCCCCTGCCTCCCTCGGCAACCCCACAATCAACCCGG 

61 120 

CCGAGTCCACCCATGACCCAGTCCCCCGACGCAGCCACCCGT7CCGGTGTTACTTGCCCC 

M N R G - 

H F F 

I B N HH NMD 

N B U HA UNO 

F V 4 AE 4 L E 

1 1 H 12 H 1 1 

GACTCCCTTTTACGCACTTGCTTCTGGTCCTGCAACTGGCCCTCCTCCCAGCAGCCACTC 

121 180 

C7CACCCAAAATCCGTCAACGAACACCACGACCTTCACCGCGAGCAGCGKCTCCGTGAG 

VPFRHLLLVLQLALLPAATQ- 

B E E R A 

B C C # S L 

V 0 0 A U 

IKK 11 

ACGCAAAGAAAGTGGTGCTGGGCAAAAAAGCGCATACACTCGAACTCACCTGTACAGCT7 
181 * • ♦ 240 

TCCCTTrCTTTCACCACCACCCGI I I I I ICCCCTATGTCACCnGACTGCACATGTCGAA 

GKKVVLGKKGDTVELTCTAS- 



H 

= B " 
C ° * 

cccacaagaagagcatacaatkcac^ccaaaa ^ctcc AAC c^gat AAAGATTCTGCGAA ^ 

241 GGGIC^C^CICGIATG^AAGGTGACCTT^ 

Q K K 5 I Q F H V K N S N Q I K I L C N - 

B S 5 F H 

S ° ™ h 3 OAF 

S ? S ? 1 

ATCAGGGCTCCTICTTAACTAAAGCICCATCCAAGCTCAATCATCCCCCT^ ^ 

301 TACKCaACGA^ 

QGSFLTKGPSKLNORAOSRR- 

S S H H 

MANAS BA I J ID 

BVLUT CU N F 

0AA9Y L3 F L 

22461 1A 12 11 

CAAGCCTTTCGCACCAACGAAACrrCCCCCTCATCATCAAGAATCTTAACATAW ^ 

361 ^iwUIcccTccnccmcAACcccc^ 

SLWDQGNFPLIIKNLK1EDS- 

S 

M M AM AM * 

B N VNUN % J 

0 L AL9L E 

40 2 1 2161 

CAGATACTTACATCTGTGAAGTGGACGACCAGAAGGAGGAGGTGCAATTCn ^ 

421 GTCTATGAATGTAGACAC^CACCTC^ 

0 T Y I C E V E 0 Q K E E V 5 L L V F G - 



! t

M Y 

1 1 
GATTCACTGCCAAaCTGACACCCACCTCCnCAGGGGCAGAGCCTGACCaGACCTTGG ^ 

481 CTAACTGACGG^GAGACTGTCGG^CGK 

L 7 A N S D T H L L Q C Q S L T L T L E - 

B ES " c 

B5 SC D I I S 

AP TR D M J T 

Nl NF E L F Y 

22 11 1 111 

AGAGCCCCCCTCGTAGTAGCCCCTCAGTGCAATGTAGGAGTCCAAGGGGTAAAAACATAC ^ 

541 TCTCGGGGGGACCATCATCGGGG^ 

SPPGSSPSVQCRSPRCKNIQ- 

N BBH S B BS 

y MO ASP A B5SG5C S B H SC 

B ND IPV L APTIAR T A L TR 

0 LE UBU U N1NACF X N A NF 

2 11 122 1 221111 1 1 4 11 

AGCCCGCCAACKCCTCTCCCTCTCTCAGCK 660 

601 TCCCCCCCTICTGGGAGAGGCAC^ 

GGKTLSVSQLELQDSGTWTC- 



N 
NS 
LP 
AH 
31 

GCACTGTCTTGCAGAACCAGAAGAACGTGGAGT7CAAAATAGACATCGTGGTCCTAGCTT ^ 

661 



y NM A 

B | HA L 

0 EE U 

2 11 1 



CGTGACAGAACGTCnGGTCTlCTTCCACCTCAACTTTTATCTGTAGCACCACGATCCAA 
TVLQNQKKVEFKIDIVVLAF- 
HS MM
AT N N 

EU L L 

31 1 1 

TCCACAACCCCTCCACCATAGTCTATAAGAAAGAGCCCCAACACCTCCACnCTCCTTCC 

72i * - * ♦ 780 

AGGTCTTCCGGAGGTCGlATCAGATAriCTTTCTCCCCCTlCTCCACCTCAAGACGAAGG 

QKASSIVYKKECEQVEFSFP- 

A AM 

L L N 

U U L 

1 1 1 

CACKGCCTTTACACTTCAAAACC7GACCCGCACTCGCCACCTGTCGTCCCACCCGCACA 

781 840 

CTCAGCGCAAATGTCAACTTnCGAC"CCCCG T CACCGCTCGA;ACCACCGTCCCCCTCT 

LAFTVEKLTCSCEl*»QAER- 

P S 

H M FM A M 

P N LN (J B 
H L ML 3 0 
1 1 11 A 2 

GGGCnCCTCCTCCAAGTCTTCGATCACCTTlCACCTGAACAACAACGAAGTGTCTCTAA 
8<1 --♦ 900 

CCCGAAGCAGGAGCTTCACAACCTAGTCGAAACTCCACnCTTCTTCCnCACACACATT 

ASSSKSWITFDLKNKEVSVK- 



A H 
L P 
U H 
1 1 
t 

AACCGGnACCCACCACCCTAAGCTCCACATCCCCAAGAAGCTCCCCCTCCACCTCACCC 

9C1 ♦ * -- ♦ 960 

HGCCCAATGCGKCTCCCAnCGAGClCTACCCCnCTTCGACGCCCACCTCGAGTCGG 

H I T I - 



6 


BS PS 




SM 


SCADNFAO 


A 


TA 


TRVRlUUD 


I 


EE 


NFAAAV9E 


U 


23 


11224161 


1 


/ 


/ / // 





R 


V 


T Q 


D P 


K I 


Q M G K K 


L P L 




BS 










BSS 


M 


SC 


HS 


D 


M 


H 


SCAHM 


N 


TR 


AT 


0 


N 


P 


TRUAN 


L 


NF 


EU 


E 


L 


H 


NF9EL 


1 


11 


31 


1 


1 


1 


11631 




/ 


/ 








/ / 



TGCCCCAGGCCTTGCCTCACTATGCTCCCTCTGCAAACCTCACCCTCGCCCTTGAACCGA 

961 ♦ ♦ • --* 1020 

ACGCGGTCCCGAACGGAGTCATACGACCCAGACCTTTGCAGTCCCACCCCCAACnCGCT 

P5ALPQYAGSGNLTLALEAK- 
S BS

F SC HO A 

A ' TR POL 

N NF HE U 

in ill 
/ 

AAACACGAAAGTTCCATCACCAACTCAACCTGGTGGTCATCAGAGCCACTCACCTCCAGA 

1021 - 1080 

TTTGTCCTTTCAACGTACTCCnCACTTGGACCACCACTACTCTCCGTGAGTCGAGGTCT 

TCKIHQEVNLVVMRATQIQK- 

PS S 

M ADNN»A OF AM DE A 

N VRlLUU DA IN OS L 

L AAAAVS EN UL E? U 

1 22"16 12 11 11 1 

///// / / / 

AAAATTTGACCTCTGAGGTGTCCGGACCCACCTCCCCTAACCTGATCCTGAGCTTCAAAC 

1081 1140 

TTTTAAACTGCACACTCCACACCCCTCGCTCCAGGCCAnCGACTACGACTCGAACrrTG 

NLTCEVWGPTSPKLMLSLKL- 

M T H M DW 

N A P N DS 

I Q A L ET 

1 1 2 1 12 

/ 

TCCAGAACAACCAGCCAAACGTC7CCAAGCGCGACAACCCCGTG7GCCTCCTCAACCCTC 

1141 4 ♦ 1200 

ACCTCTrGnCCTCCGTT7CCACACCnCGCCC1CTTCCGCCACACCCACGACnGCGAC 

ENKEAKVSK RE KPVWVLNPE- 

H PS * H 

F 0 M I A ADPA I 

0 D A N V VRUU N 
K E E F A AAV9 F 

1 13 11 2216 1 

/// 

AGCCGCCGATCTGCCAGTG7CTCCTCAGTCACTCCGCACAGCTCCTGCTGCAATCCAACA 

1201 ♦ ♦ ♦ 1260 

TCCGCCCCTACACCCTCACACACGACTCACTCAGCCCTCTCCAGGACGACCnACCnCT 

ACMWQCLLSDSGQVLLESNI- 
s


SA 


BHF BS 


ANA 


HNCP 


SCNWAANXA 


VI U 


PCRA 


PIUNMULHV 


AA9 


A1FL 


1ADLH3A0A 


236 . 


2111 


212UA421 


// 


// 


1 1 1 1 



H 

RSD I A 
SCD N L 
AAE D U 
111 3 1 
/ 



1261 



1320 



ACmCAACACCCCTGTACCACCTGCCCCCACCTCCGCCTACCCCTCCCACTCATCATTC 



K 


V L 


P 


T W S 


T P V H 


A 0 P 


E 






E 




BS 


SS 


F 


BS 


F 


H 


CHH 


F 


SC 


HHNCF 


N 


BSC 


N 


P 


OHA 


0 


TP. 


PCCRA 


U 


BTR 


U 


H 


4AE 


K 


NF 


AA1FN 


4 


VNF 


4 


1 


712 


1 


11 


21111 


H 


111 


H 


/ 






/ 


II 




// 





CTTCAGCGCTCCTGCCTGGACCCA1CCCCCCTATGCAGCCCCAGTCCAGGGCAGCAAGCC 

1321 ♦ 1380 

GAACTCGCGAGGACGGACCTGCGTACCGCCCATACCTCGGGGTCACGTCCCCTCGTrCCG 

S S 

DBWHNt h».*n;n v K'\y: 

R5ABPLU PNCR. N K'.Dr 

AVE0H.A9 AL1FA L LAEj 

2132146 2111^ 1 1312 

// // // 
AGGCCCCCTCTGCCTCTTCACCCGGAGCCTCTGCCCGCCCCACTCATGCTCAGGGAGAGG 

1361 1*40 

TCCCGGGCAGACGGAGAAGTGGCCCICGCAGACCGGCGGGCTCAGTACGAG'ICCCTCTCC 

BS P B BS S 

SC F M B N S SCDHA 

TR L A A L P TRRAU 

NF M E N A 1 NFAE9 

11 1 1^42 11236 

/ / / 

GTCnCTCGCTTTTICCCAGGClCTGGGCAGGCACAGGCTAGGTGCCCCTAACCCAGGCC 

1441 ♦ ♦ » ♦ * » 1500 

CAGAAGACCGAAAAAGCCTCCCACACCCGTCCCTGTCCGATCCACGGCGATTGCGTCCGG 



B 


6 


B 




S 


PS 


S 


DBS 


S 


U 


HNC 


ADNPA 


P 


DAP 


P 


N 


PCR 


VRLUU 


M 


EN1 


M 


L 


AIF 


AAAM9 


1 


122 


1 


1 


211 


22416 




/ 






/ 


/ // 



CTGCACACAAACCGGCAGGTGCTGGCCTCAGACCTGCCAACAGCCATATCCCCCAGCACC 

1501 1560 

GACGTGTGTTTCCCCGTCCACCACCCCACTCTGGACCCTTCTCCGTATACCCCC7CCTCG 
0 H DAM
D A D L N 

E E E U I 

1 3 111 
CTGCCCCTCACCTAAGCCCACCCCAAAGCCCAAACTCTCCACTCCCTCACCTCGGACACC 

1561 ♦ * - ♦ 1! 

CACGGCCACTGGATTCGGGTGGGGTT7CCGGTTTGAGAGGTGAGGGAGTCGAGCCTCTGG 

tPLT.AHPKGQTLHSLSSDT - 
CP-PKPTPKAKLSTPSARTP- 
APDLSPPQRPNSPLPQLG KL- 

H S 

I M MM DP F 

N N AB DO A 

F L EO EK N 

1 1 32 11 1 

/ / 
nCTCTCCTCCCACAnCCAGTAACTCCCAATCTTCTCTCTCAGGGACTCCATCCCCCCC 

1621 ♦ ♦ 1 

AAGACAGGAGGGTC7AAGGTCATTGAGGGTTAGAACACAGAGTCCCTCACGTAGGCCCCG 



G S A S A P - 



c 

C 

N 0 
L R 
1 1 

AACCCTTncCCCCTCGTCTCCTGTGACAAncC. . . . 

1681 * 1714 

T7GGGAAAAGGCGCAGCAGACCACAOCTTAACC. . . . 

TLFPLVSCENS .... 
Table 4



FN SB 
N S B M H DHA S 

UP B N G RAU T 

4 B V L A AE9 X 

H 2 11 1 236 1 

/ 

GCCTGTTTCAGAACCAGCCGCCAAGAAAGACGCAAGCCCAGACCCCCTGCCATTTCTGTG 

1 * 60 

CGGACAAACTCnCGTCGCCCCTlCTTrCTGCGnCGCGTCTCCGGGACGGTAAAGACAC 

B PS S S 

DBS ADNPA D DHNA V HW HNC 

DAP VRLUU D RALU N AN PCR 

EN1 AAAV9 E AEA9 L EL AJF 

122 22416 1 2346 1 31 211 

/ / // / / / 

GGCTCAGGTCCCTACTGGCTCAGGCCCCTGCCTCCCTCGGCAAGCCCACAATGAACCGGG 

61 120 

CCGAGTCCAGGGATGACCGAGTCCGGCGACCCAGGGACCCGTTCCGCTGTTACnCCCCC 

M N R G - 

H F F 

I B N HH NMD 

N B U HA U N D 

F V 4 AE 4 L E 

1 1 H 12 H 1 1 

GAGTCCCTTTTAGGCACTTGCnCTGCTGCTGCAACTGGCGCTCCTCCCACCAGCCACTC 

121 * 180 

CTCAGGCAAAATCCGTCAACGAAGACCACGACGTTGACCGCGAGGACGGTCGTCGGTGAG 

VPFRHILLVIQLA|LLPAATQ- 

B E E R A 

B C C S L 

V 0 0 A U 

IKK 11 

AGGCAAAGAAAGTGGTCCTGGCCAAAAAACCGCATACAGTGGAACTGACCTCTACAGCTT 
181 -- ♦ 2A0 

TCCCTT7CTT7CACCACCACCCCI I I I I ICCCCTATGTCACCTTCACTCGACATCTCCAA 

CKKVVLCKKCDTVELTCTAS- 
H

M M I 
B B N 
0 0 F 

2 2. 1 
CCCAGAAGAAGACCATACAATTCCACTGGAAAAACTCCAACCA:ATAAASAnCTGGCtt 

241 300 

GGGTCTTCnCTCGTATGTTAAGGTGACCTTTTTGAGGTTGGlCTATTTCTAACACCCTT 



K S 


I Q F 


H W K 


N S 


N Q 


I 


K 


I 


B 




S 




S 


F 




H 


NBS 


F 


AA 


A 


A 


N 


H 


1 


LAP 


0 


VU 


L 


U 


U 


H 


N 


AN1 


K 


A9 


U 


3 


D 


A 


F 


422 


1 


26 


1 


A 


2 


1 


1 


/ 




/ 













ATCAGGGCTCCrrcnAACTAAAGGTCCATCCAAGCTGAATGATCGCGCTGACTCAAGAA 

301 360 

TAGTCCCGAGCAAGAAnGATTrCCAGGTAGGTTCGACTTACTAGCGCGACTGAGncn 



S F L T K 


G P S K 


L N D 


RAD 


S R 


1 


S 


5 


H 




H 




MANAS 


BA 


I 


A 


I 


D 


BVLUT 


CU 


N 


F 


N 


D 


0AA9Y 


L3 


F 


I 


F 


E 


22^61 


1A 


1 


2 


1 


1 


/ 


/ 











GAACCCTTTGCGACCAACGAAACTTCCCCCTCATCATCAACAATCTTAACATACAAGACT 

361 ♦ 420 

CTTCGGAAACCCTGGTTCCTTTCAAGGGCGACTACTAGTTCTTAGAATTCTATCTTCTGA 

SLWDQGNFPLI IKNLKIEDS- 

S 

M M AW AM M 

B N VNUN f A 

0 L AL9L E 

2 1 2161 1 

// 

CAGATACTTAGATCTGTCAACTCCACCACCAGAACCACCACCTGCAAT7CCTAC7CrrCG 

421 — . --•» ♦ ♦-- •» ♦ 480 

GTCTATCAATCTAGACACnCAt:CTCCTCGTCnCCTCCTCCACCTTAACCATCACAACC 

DTYICEVEDQKEEVQLLVFG- 
B

S S 

P T 

M Y 

1 1 
GArrCACTCCCAACTCTCACACCCACCTGCTTCAGCCCCAGAGCCTGACCCTCACCTTCG 

481 5 '° 

CTAACTGACCCTTGACACTGTCGCTGGACGAAGTCCCCCTCTCGCACTGCGACTGCAACC 

LTANSDTHLLQCQSLTltLE- 

g gc K 

BS SC D WIS 

AP TR D N' NT 

Nl NF E L FY 

22 11 1 111 

AGAGCCCCCCTGGTAGTAGCCCCTCAGTGCAATGTAGGAGTCCAAGCGGTAAAAACATAC 

541 - 600 

7CTCGGGCCGACCATCATCCGGGAGTCACGT7ACA7CCTCAGGnCCCCATTTTTGTATG 
SPPGSSPSVQCRSPRCKN1Q- 

N BBH S B BS 

M MD ASP A BSSCSC S B N SC 

B ND LPV L APT1AR T A L TR 

0 LE UBU U N1NACF X N A NF 

2 11 122 1 221111 1 1 4 11 

AGGGGGGGAAGACCCTCTCCGTGKTCAGCTCGAGCTCCAGGATAGTGCCACCTGGACAT 

601 TCCCCCCCTTCTCGCAGAGGCAU 

CCKTLSVSQLELQDSCTKTC- 

« R HA L 

AH 0 I ^ U 

31 ^ 11 1 



GCACTCTCTTCCAGAACCAGAACAAGGTGGAGTTCAAAATACACATCGTGGTGCTAGCTT ^ 

CGTCACAGAACGTCTTGGTCT^CTT 

TVLQNQKKVEFKlDIVVLAf- 



HS MM 
AT N N 

EU L L 

31 1 1 

TCCAGAAGGCCTCCAGCATAGKTATAAGAAACACGGCGAACACCTCCAGTTCK ^ 

721 AGGTCTTCCGGAGGTCGTATCAGAT^ 

Q K A S S I V Y K K E G E Q V E F S F P - 
A AM
L L N 

U U L 

1 1 1 

CACTCGCCTTTACAGT7CAAAAGCTGACGGGCAGTGGCGAGCTGTGGTCCCAGGCGCAGA 

78! , . ♦ e ^ 

CTGAGCCGAAATGTCAACTTrTCGACTGCCCCTCACCCCTCGACACCACCGTCCGCCTCl 
LAF7VEKLTGSC£L**QA£R- 

P S 

H M FM A V 
P N IN U B 
H L ML 3 0 
1 1 11 A 2 
CGGCTTCCTCCTCCAAGTCTTCGATCACCTTTCACCTGAACAACAAGGAACTGTCTCTAA 

841 -* 900 

CCCGAAGGACGACGTTCAGAACCTAGTGGAAACTGGACTTCncnCCTTCACAGACATT 

ASSSKSWITFOLKNKEVSVK- 
B BS PS 

SM SCADNPAD A AH 
TA TRVRLUUD L LP 
EE NFAAAM9E U U H 

23 11224161 1 1 1 

AACGGGnACCCAGGACCCTAACCTCCAGATCGGCAACAAGCTCCCGCTCCACCTCACCC 

901 960 

TTGCCCAATGGGTCCTGGGAnCGACGTCTACCCGnCTTCCACGGCGAGGTGGAGTGGG 

RVTQDPKLQMCKKLPLHLTL- 

BS BSS 

M SCHS D M H SCAHM 

N TR AT 0 N P TRUAN 

L NF EU E L H NF9EL 

1 11 31 1 11 1^31 

/ / * f 

TGCCCCACGCCTTGCCTCAGTATCCTGGCTCTGGAAACCTCACCCTGGCCCTTGAAGCGA 

961 ♦ * 1020 

ACGGGCTCCGCAACGCAGTCATACGACCGAGACCTTTGGAGTCGCACCGCGAACnCGCT 

PQALPQYACSCNLTLALEAK- 

S BS 

F SC H 0 A 

A TR P D L 

N NF HE U 

1 11 111 

/ 

AAACACCAAAGTTCCATCACGAAGTCAACCTCCTCCTCATCACACCCACTCACCTCCACA 

1021 ♦ * * 1° 80 

TTrCTCCTTTCAACGTAGTCCTTCACTTGGACCACCACTACTCTCGGTGACTCCAGGTCT 

TCKLHQEVNLVVMRATQLQK- 
PS


S 








M 


ADNNPA 


DF 


AM 


DE 


A 


N 


VRLLUU 


DA 


IN 


DS 


I 


L 


AAAAVS 


EN 


UL 


EP 


U 


1 


224416 


11 


11 


11 


1 




///// 


/ 


/ 


/ 





1081 



aaaatttgacctgtgaggtctgcggacccacctcccctaagctgatgctgagcttcaaa; 

ttttaaactggacactccacacccctgcctggaggggattccactacgactcgaactttc 
nltcevwcptspklmlslkl- 



M T H M DV 

N A P N DS 

L Q A L ET 

1 1 2 1 12 

TGCAGAACAAGGAGGCAAAGGTCTCCAAGCGCGAGAAGCCGCTCTGGGTGCTGAACCCTG 

1141 « 1200 

ACCTCnGTTCCTCCGTTTCCACAGCTlCGCCCTCnCGGCCACACCCACGACTTCGGAC 
ENKEAKVSKREKPVWVLNPE- 

H PS H 

P D M I A ADPA I 

0 D A N V VRUU N 
K E E F A AAV9 F 

1 13 11 2216 1 

/// 

AGGCGGCGATGTGGCAGTGTCTGCTGAGTGAC7CGGCACAGGTCCTGCTGGAATCCAACA 

1201 ♦ 1260 

TCCGCCCCTACACCGTCACAGACGACTCACTGAGCCCTGTCCAGGACGACCTTAGGTTGT 
ACMWQCLLSDSCQVLLESN1- 

BHF BS H 

SCNVAANXA RSD I A 

PIUNMULHV SCD N L 

1ADLH3AJA AAE D U 

21211A421 111 3 1 

/ / / / / 
TCAAGGT1CTGCCCACATGGTCCACCCCGGTCCACGCCCATCCCGAGGGTGAGTACTAAG 

1261 1320 

AGTTCCAAGACCGGTCTACCAGCTCGGCCCACCTGCCCCTACGCCTCCCACTCATGATTC 



H 
P 
H 
1 

CTTCACCGCTCCTCCCTGGACGCA1CCCGGCTATCCACCCCCACTCCACCGCACCAACGC 



S 


SA 


ANA 


HNCP 


VLU 


PCRA 


AA9 


A1PL 


236 


2111 


// 


// 



V L 


P 


T W 


S T P V 


HAD 


P E 




E 




BS 


ss 


F 


BS 


F 


CHH 


F 


SC 


HHNCF 


N 


BSC 


N 


OHA 


0 


TR 


PGCRA 


U 


BTR 


U 


4AE 


K 


NF 


AAIFN 


4 


VNF 


4 


712 


1 


11 


21111 


H 


111 


H 






/ 


II 




II 





1321 



GAAGTCCCCAGCACCGACCTGCCTACGCCCCATACGTCGCCCTCAGCKCCCTCGnCCG 
S s



DBHIHNA 



HMNCN . M VND'. 



RBABPLU PNCRL N N Dd 

AVE0HA9 AL1FA L L.AEC 

2132146 21114 1 1312 

AGGCCCCGTCTGCCTCTTCACCCGGAGCCTCTGCCCGCCCCACTCATCCTCAGGGAGACG 

1381 * 

TCCCGGCCAGACGGAGAACTGGGCCTCGGAGACGGGCCGGGTGACTACCAGTCCCTCTCC 

BS P B BS S 

cr F M B N S SCDHA 

T R L A A L P TRRAU 

NF M E N A 1 NFAE9 

11 1 114 2 U236 

/ I I 

CTCnCTGCCmTrCCCAGGCTCTGGGCACGCACAGCCTAGCTGCCCCTAACCCAGCCC ^ 

1441 CACAAGACCGAAAAAGGGTCC^ 

B B B S PS 

I DBS S M HNC AONPA 

p DAP P N PCR VR^UU 

M EN1 M L AIF AAAV9 

, 122 11 211 22416 

/ / / // 
CTGCACACAAAGGGGCAGGTCaCGGCTCAGACCTGCCAAGAGCCATATCCGGGAGGACC 

.... # # . . * «» ♦ * 155C 

GACGTG7GTTTCCCCGTCCACGACCCGAGTCTCGACGGTTCTCGGTATAGGCCCTCCTGG 

0 H DAM 
D A D L N 
E E E U L 

1 3 111 
CTGCCCCTGACCTAAGCCCACCCCAAAGGCCAAACTCTCCACTClCTCACCTCGGACACC 

1561 . . ♦ I 620 

GACGGGGACTGGATTCGGGTCGGGTnCCGGTTTCAGAGGTGAGGGAGTCGAGCCTGTCG 

H F 

I M MM BP DE AN 

N N AB BS OS LU 

F L EO VT EP U4 

1 1 32 11 11 1H 

/ / / 

■nCTCTCCTCCCAGAnCCAGTAACTCCCAATCTTCTCTCTCCAGTGATTCCTGAGCTCC 

16 21 ♦ * * * 1633 

AACAGAGCACGCTCTAACGTCATTCAGGCTTAGAACACACACCTCACTAACCACTCCACC 

V I A E L P - 
F

V H V V N 

B C N B U 

0 A L 0 D 

2 2 2 

CTCCCAAACKAGCCTCTTCCTCCCACCCCGCGACGGCrrCTTCGGCAACCCCCGCAAGT ^ 

1681 wcCCmCACKCCACUCCACCCTCCCGW 

PKVSVFVPPRDCFFGKPRKS- 

BS S H B S F 

» cr H HNC I B SMC N 

, i PCR N B TNR U 

U NFE AIF F V NLF 4 

1 11 3 211 1 1 HI H 

/ // II 

CCAACCTCATnCCCACCCCACCCGTTrCAGTCCCCCGCAGAnCACGTGTCCTCCCTGC 

1741 wnCGAGTAGACGCTCCCGTGCCCAAAGTCAGCCGCCCTCTAAGTCCACAGGACCCACC 

K L I C Q A T G F S P R Q 1 Q V S W L R - 

F B S BS H 

uu c H H AV AA SCM D H I 

H PC HA VU TRN D A N 

DA M HA AE A9 NFL E E 

21 i 1 1 23 26 111 13 1 

II 

GCGAGGGCAACCAGCTGGGGTCTGGCGTCACCACGCACCAGGTCCACGCTCACGCCAAiC ^ 

1801 CCCTCCCC^CCTCCACCCCACACCCCAGTW^ 

ECKQV CSCVTTDQVQAEAKE- 

S5 B B 

AAHNABS SM ■ 1 

UUALPAP TA P 

99EAAN1 EE H 

6634122 23 1 

ACTCTCGGCCCACGACCTACAAGCTGACCACCACACTGACCATCAAACAC .... 

1861 TCACACCCCCCTCCTGCATCnCCA • • 

SCPTTYKVTSTLT1KE ... 
Table 5

FN SB 
N S B M H DHA S 

UP B N C RAU T 

4 B V L A AE9 X 

H 2 1 1 1 236 1 

/ 

CCCTGTTTCACAAGCAGCCCCCAACAAAGACGCAACCCCACACCCCCTCCCATTTCTCTC 

1 ♦ « 60 

CGGACAAACTCTTCGTCGCCCGTTCTTTCTGCGTTCGGGTCTCCGGCACCGTAAAGACAC 

B PS S S 

DBS ADNPA 0 DHNA M HM HNC 

DAP VRLUU D RALU N AN PCR 

EN1 AAAM9 E AEA9 L EL AIF 

122 22416 1 2346 1 31 211 

/ / // / / / 

GCCTCACGTCCCTACTCCCTCACCCCCCTCCCTCCCTCCGCAACGCCACAATGAACCGGG 

61 • ♦ 120 

CCGAGTCCAGCGATGACCGAGTCCCCGGACGCAGCGAGCCGTTCCGCTGTTACTTCGCCC 

M N R G - 

H F F 

I B N HH NMD 

N B U HA U N D 

F V 4 AE 4 L E 

1 1 H 12 H 1 1 

GAGTCCCTTTTAGCCACTTGCTTCTGCTGCTCCAACTCGCGCTCCTCCCAGCAGCCACTC 

121 ♦ 180 

CTCAGCGAAAATCCCTGAACGAACACCACGACGTTGACCGCGACGAGGGTCGTCGCTGAG 

VPFRHLLLVLQLALLPAATQ- 

B E E # R A 

B C C " S L 

V 0 0 A U 

IKK 11 

AGCGAAAGAAAGTGGTGCTGGCCAAAAAAGGGGATACAGTGGAACTGACCTCTACAGCTT 

181 ♦ — ♦ 240 

TCCCTTTCTTTCACCACGACCCCI Nil ICCCCTATGTCACCTTGACTCGACATCTCGAA 

CKKVVICKKCDTVELTCTAS- 

H 

MM I 

B B N 
0 0 F 
2 2 1 

CCCAGAAGAACAGCATAC AATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGGCAA 
241 ♦ 3CC 

GGGTCTlCTlCTCGTATcriAAGGTCACCTTTTTCAGCnGGTCTATTTCTAAGACCCTT 

QKKS1QFHWKNSNQIKILGN- 
s


F 




H 


A 


A 


N 


H 


I 


L 


U 


U 


H 


N 


U 


3 


D 


A 


F 


1 


A 


2 


1 


1 



E S 

NES F A A 

LAP 0 VU 

AN] K A9 

422 1 26 

ATCAGCGCTCCrrcnAACTAAAGGTCCATCCAAGCTGAATGATCGCGCTCACTCAAGAA 

301 * * * - 36C 

TAGTCCCGACGAAGAATTGATTrCCAGGTAGGnCGACTTACTAGCGCGACTGAGTTCTT 

QCSFITKCPSKLNDRADSRR- 

S S H H 

MANAS BA I A ID 

BVLUT CU N F ND 

0AA9Y L3 F L F E 

22461 1A 12 11 

/ / 
GAAGCCTTTCGGACCAAGGAAACTTCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT 

361 <20 

CTTCGGAAACCCTGGTTCCTTTGAAGGGCGACTAGT AGTTCTTAGAATTCTATCTTCTGA 

SL WDQGNFPLI I KNLKIEDS- 

S 

M M AVAV M 

B N VNUN A 

0 L AL9L E 

2 1 2161 1 

// 

CACATACHACATCTCTCAACTCCACCACCAGAACCACGACGTGCAAT7GCTACTGTTCG 

421 -- ♦ ♦ ♦ ♦ 480 

GTCTA1GAATGTAGACACTTCACCTCCTCGTCTTCCTCCTCCACGTTAACGATCACAACC 

DTYlCEVEDQKEEyQLLVFG- 

B 

S S 
P T 
M Y 
1 1 

GATTGACTGCCAACTCTGACACCCACCTGCT7CACGGGCACAGCCTGACCCTGACCTTGG 

481 54C 

CTAAOCACGCnGACACTCTCCCTCCACGAACTCCCCGTnCGCAOGCCACTCGAACC 

LTANSDTHLLQGQSLTLTLE- 
8 BS H

BS SC D MIS 

AP TR D N NT 

M NF E L FY 

22 11 1 .111 

AGACCCCCCCTGGTAGTAGCCCCTCACTCCAATCTACCAGTCCAACGCCTAAAAACATAC 

541 ♦ 60S 

TCTCGGGCCGACCA7CATCGGGCAGTCACGTTACATCCTCACGTTCCCCATTTTTGTATC 
SPPGSSPSVQCRSPRGKNIQ- 

N BBH S B BS 

M MO ASP A BSSGSC S B N SC 

B NO LPV L APTIAR T A I TR 

0 LE UBU U NlNACF X N A NF 

2 11 122 1 221111 1 1 4 11 

// / /// / 

AGGGGGCGAACACCCTCTCCGTGTCTCAGCTCGAGCTCCAGGATAGTGGCACCTCGACAT 

601 • 660 

TCCCCCCCTTCTCGGAGACGCACAGACTCGACCTCGACGTCCTATCACCGTGGACCTGTA 

CGKTLSVSQLELQDSCTWTC- 

N 

NS M NV A 

L p B HA L 

AH 0 EE U 

31 2 11 1 

GCAC1GTCTTGCAGAACCACAAGAAGGTGGAGTTCAAAATACACATCGTGCTCCTAGCTT 

661 • ™ 

CGTGACAGAACGTCnGGTCnCTTCCACCTCAAGTTTTATCTGTACCACCACGATCCAA 

TVLQNQKKVEFKID1VVLAF- 

HS MM 
AT N N 

EU L L 

31 1 1 

/ I 

TCCAGAACCCCTCCAGCATAGTCTATAAGAAACACGCGCAACACGTCCACnCTCCTTCC 

721 * ♦ 780 

AGGTCnCCGGAGGTCGTATCACATAnCTTTCTCCCCCTTGTCCACCTCAAGACGAAGC 

QKASSIVYKKECEQVEFSFP- 

A AM 

L L N 

U U L 

1 1 1 

CACTCCCCTTTACAGnCAAAACCTGACCGGCAGTGGCGACCTCTGCTCCCAGGCGCAGA 

7fll ♦ ♦ 840 

CTGAGCCGAAATCTCAACTTTTCGACTGCCCGTCACCGCTCGACACCACCCTCCGCCTCT 

L aftveklt gsgel w wqae r - 
P 5

K 1/ Fl/ A M 
F N LN U B 
K I VI 3 0 
] 1 11 A 2 
CGGCTTCCTCCTCCAAClCnGGATCACCTTTCACCTCAAGAACAACCAAGTGTCTCTAA 

841 — * 

CCCCAAGGAGGAGGTTCAGAACCTAGTGGAAACTGGACT7CTTGTTCCTTCACAGACATT 



ASSSKSWITFDLKNKEVSVK- 
6 BS PS 

SW SCADNPAD A AH 
TA TRVRLUUD L LP 
EE NFAAAW9E U U H 

23 11224161 1 1 1 

AACGGGTTACCCAGCACCCTAAGCTCCACATGGGCAACAAGCTCCCCCTCCACCTCACCC 

901 — — , , --♦ . 960 

TTGCCCAATCGGTCCTCCGAnCCAGCTCTACCCGncnCCAGGGCGAGCTCGAGTGGC 

RVTQDPKLQMCKKLPlHLTl- 

BS BSS 

M SC HS D M H SCAHM 

N TR AT D N P TRUAN 

L NF EU E L H NF5EL 

1 11 31 1 11 11631 

/ / / / 

TGCCCCACGCCrrCCCTCAGTATCCTGCCTCTGCAAACCTCACCCTCCCCCTTGAACCGA 

961 

ACCCGCTCCCCAACCGACTCATACGACCCAGACCTnCCAGTCCCACCCCCAACrrCCCT 

PQALPQYACSGNITIALEAK- 

S BS 

F SC H D A 

A TR f POL 

N NF H E U 

1 11 111 

/ 

AAACACCAAAGT7CCATCACCAACTGAACCTCCTCGTCATCACACCCACTCACCTCCACA 

1021 •» * --♦ ♦ * 1080 

TTrCTCCTTTCAACGTAGTCCTTCACnCGACCACCACTACTCTCGGTGAGTCGACCTCT 

TCKLHQEVNLVVMRATQLQK- 
PS S

M ADNNPA DF AM DE A 

N VRLLUU DA LN DS L 

, AAAAM9 EN UL EP U 

l 224416- 11 11 11 1 

AAAATTTCACCTGTGAGGTCTGGGCACCCACCTCCCCTAAGCTGATGCTCAGCTTGAAAC 
1081 mArACTGGACACrcCACACCCCTCGGTGGAGCGGAnCCACTACGACTCGAACrrrG 

NLTCEVWCPTSPKLMLSLKL- 

I A P N DS 

I 5 A L ET 

1 1 2 1 12 

TGGAGAACAAGGAGGCAAAGGTCTCGAAGCGGGAGAAGCCGGTGTGGGTGCTGAACCCTG 

1141 ACCTCnGnCCTCCG^CCAGAGCn 

ENKEAKVSKREKPVWVLNPE- 

H PS H 

F D M I A ADPA I 

0 0 A N V VRO) N 
K E E F A AAV 9 F 

1 13 11 2216 1 

ACGCCCGGATCTGGCAG7GTCTGCTGACTGACTCGCCACAGGTCCTGCTCGAATCCAACA 
1201 TCCCCCCCTACACCGTCACAGACGA^ 



A G 



MWQCLLSDSGQVLLESNI 



5 SA BHF BS B 

ANA HNCP SCWAANXA SH 

VLU PCRA PIUNMJLHV * pp 

AA9 AIFl 1ADLH3A0A 1H 

236 2111 21211A421 21 

TCAAGGnCTCCCCACATGGTCCACCCCCGTGCACGCGGATCCCCAGGCTCAGTGTGCCC 

1261 iScCAAGACGGGTGTACCACGTW 

KVLPT WSTPVHADPE 
BS S S S

MF SC F DHNA HNC A W M 

AO TR A RALU PCR .FAB 

£K NF N AEA9 AIF LEO 

11 11 1 2346 211 3 2 2 

TACACTACCCTCCMCCA^^ 13 8C 

1321 ircrcArcrGArcrAccrcrcrcrcrcrcircGGrcrArGA'crcrccAciTGGAGCTAGA 

BS S MS 

y D M SC M ANA M u T 

N D N TR B VLU B f Y 

, c l NF 0 AA9 0 , , 

\ \ i 11 2 246 2 1 1 

CnCCTCACCACCTGAACTCCTGGGCCGACCGTCAGTCTlCCTCTlCCCCCa ^ 

1351 CAAGGAGTCGTGGACTIGAGGA^ 

APELLC GPSVFLFPPKPK- 

S SS N 

AN M HMANNAC DM M NS M 

UL N PNVCLUR DS A LP * 

3A L ALAIA9F ET E AH E 

A3 1 2121461 12 3 31 2 

AGCACACCCTCATCATCTCCCCCACCCCTCACG^ UQ0 
1441 ICOCTCGCAGTAO AGAGCGCC^ 

D T L M I S R T P E V T C V V V D V S H - 

N DS B SA N 

L ET 0 AE L 

I 12 2 12 1 f 

ACCAAGACCCTGAGGTCAACnCAACTGCTACGTCGACGGCGTCCAGCK ^ 

n ° l tgcttctgggactccagt^caagttg^ 

edpevkf nwyvdgvevhnak- 



F FN 



s 



M N NSS R M R HNC HH 

N U UPA S AS PCR CP 

L 4 DBC A E A AIF AH 

1 H 222 1 2 1 211 11 



acacaaagccgcgggaggagcagtacaacagcacgtaccgggtcctcagcgtcctcaccg 
TCTcmcGcccccaca^ 

T K P R E E Q Y N S T Y R V V S V L T V 
M SC ;

N TR J 
L NF * 

1 n i 

7CCTGCACCAGGACTCCCTCAATCCCAACCAGTACAAG7CCAACCTCTCCAACAAAGCCC 
1621 ^GrCGrGGrcCTCArCGAcfrACrG^CCKATGnCACGnCCACAGGnGTTrCGGG 

LHQDWLNGKEYKCKVSNKAL- 

P S S 

T ADNNPW.A A 

u A VRILUNU U 

L n AAAAML9 S 

•T , 2244116 6 

//// / 

TCCCAGCCCCCATCCAGAAAACCATCTCCAAAGCCAAAGGTGGGACCCGTGGGCTGCGAG ^ 

1681 AGGGKCCGGGTAGCTCTT^CGTA^ 

PAP1EKTISKAK 

S N 

H M N HHN BSAH 0 M M S R 

AN L APA GFUA 0 N A PS 

EL A EAE L19E E L E B A 

3 13 321 1163 113 2 1 

GCCCACATCGACAGAGGCCGGCTCCGCCCACCCTCTGCCCTCAGAGTGAaGCTGTACCA ^ 

1741 CCGGTGTACCTGTCTCCGGCCGAG^ 

F SS 

M N A B R F I AHNNCC 

N u V B SO VPCCRR 

L 4 J V A K AA11FF 

1 HI 1 11 121111 

1 //// 
ACCTCTGTCCTACACGGCAGCCCCCACAACCACAGGTGTACACCCTGCCCCCATCCCGCG ^ 

1801 TGGAGACAGGATGTCCCGTCCCCGO 

GQPREPQVYTLPPSRO- 
BS BS B

S A F SC SC S 

ML 0 TR TR P 

A U K NF NF M 

11 1 11 » 1 

ATCACCTCACCAACAACCAGCTCACCCTGACCTCCCTGCTCAAACGCTTCTATCCCAGCG 

1861 ♦ * ♦ ♦ 1520 

TACTCGACTGGTTCTTCCTCCAGTCGCACTGGACGGACCAGTTTCCCAAGATACGCTCCC 

ELTKNQVSLTCLVKCFYPSD- 

F 

N H B 
UP B 
4 A V 
H 2 1 

ACATCCCCCTGCAGTGGCACAGCAATCCCCACCCCCACAACAACTACAACACCACCCCTC 
1921 TCTAGCCCCACcicACCCTCTCGnACCCGTCCGCCTCTTCnGATCnCTCGTCCGGAC 

I AVE*£SNGQPENNYKTTPP- 



H 



B 

it 1 V N H MA S 

N N B L P N L P 

L F 0 A H L U V 

112 4 1 11 1 

CCGTGCTGGACTCCGACGCCTCCTTCTTCCTCTACAGCAAGCTCACCCTCGACAACAGCA 

1S 6i * 2040 

CGCACCACCTCACCC7CCCCACCAAGAAGCACATCTCCT7CCAGTCGCACC7CT7CTCGT 

VLDSOCSFFLYSKLTVOKSR- 

F S 

NM VEX NF M N N 

UB ABM LA N S L 

40 EVN AN L I A 

H2 211 31 1 1 f 

GGTGCCAGCAGGGGAACCTCTrCTCATGCTCCGTGATGCATCAGGCTCTGCACAACCACT 

2041 CCACCGTCGTCCCCnGCACAAGAGTACCAGGCACTACGTACTCCCAC 

WQQGNVFSCSVMHEALHNHY- 

S 

M M HNC CXH 

B N PCR FMA 

0 L AIF RAE 

2 1 211 i33 

/ / 
ACACCCAGAAGACCCTCTCCCTG7C7CCCGGTAAATGACTGCCACCGCCG 

.... # ... . ♦ 2150 

7GTGCGTCTTC7CCCACACCGACAGACGCCCATTTACTCACCCTGCCCGC 

TQKSLSLSPCK- 
Example 2: Preparation of the Fusion Proteins from Supernatants of COS Cells



COS cells grown in DME medium supplemented with 10% Calf Serum and gentamicin sulfate at 15
ug/ml were split into DME medium containing 10% NuSerum (Collaborative Research) and gentamicin to
give 50% confluence the day before transfection. The next day, CsCl-purified plasmid DNA was added to a
final concentration of 0.1 to 2.0 ug/ml, followed by DEAE dextran to 400 ug/ml and chloroquine to 100 uM.
After 4 hours at 37°C, the medium was aspirated and a 10% solution of dimethyl sulfoxide in phosphate
buffered saline was added for 2 minutes, aspirated, and replaced with DME/10% Calf Serum. 8 to 24 hours
later, the cells were trypsinized and split 1:2.

For radiolabeling, the medium was aspirated 40 to 48 hours after transfection, the cells were washed once
with phosphate buffered saline, and DME medium lacking cysteine or methionine was added. 30 minutes
later, 35S-labeled cysteine and methionine were added to final concentrations of 30-60 uCi and 100-200 uCi,
respectively, and the cells were allowed to incorporate label for 8 to 24 more hours. The supernatants were
recovered and examined by electrophoresis on 7.5% polyacrylamide gels following denaturation and
reduction, or on 5% polyacrylamide gels following denaturation without reduction. The CD4Bγ1 protein gave the
same molecular mass with or without reduction, while the CD4Eγ1 and CD4Hγ1 fusion proteins showed
molecular masses without reduction of twice the mass observed with reduction, indicating that they formed
dimer structures. The CD4 IgM fusion proteins formed large multimers beyond the resolution of the gel
system without reduction, and monomers of the expected molecular mass with reduction.
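
The dimer inference above rests on a simple ratio of apparent masses. A minimal sketch of that reasoning follows, assuming the non-reduced species is a disulfide-linked multiple of the reduced chain; the function name and the tolerance are hypothetical choices for illustration, not part of the described procedure. The 160 and 80 kDa values are the approximate masses reported for CD4Eγ1 in Examples 7 to 9.

```python
def oligomer_state(mass_nonreduced_kda, mass_reduced_kda, tolerance=0.25):
    """Estimate how many reduced chains make up the non-reduced species.

    Assumes the non-reduced band is an integer multiple of the reduced band;
    `tolerance` (hypothetical) bounds the allowed deviation from an integer.
    """
    if mass_reduced_kda <= 0:
        raise ValueError("reduced mass must be positive")
    ratio = mass_nonreduced_kda / mass_reduced_kda
    n = round(ratio)
    if n < 1 or abs(ratio - n) > tolerance:
        raise ValueError(f"mass ratio {ratio:.2f} is not close to an integer")
    return n


# CD4Egamma1: about 160 kDa non-reduced versus about 80 kDa reduced
print(oligomer_state(160, 80))
# A protein with the same mass under both conditions is monomeric
print(oligomer_state(80, 80))
```

Multimers that fail to enter the gel, as described for the CD4 IgM fusion proteins, cannot be assigned a state this way because no non-reduced mass is measured.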

Unlabeled proteins were prepared by allowing the cells to grow for 5 to 10 days post transfection in
DME medium containing 5% NuSerum and gentamicin as above. The supernatants were harvested,
centrifuged, and purified by batch adsorption to either protein A trisacryl, protein A agarose, goat anti-human
IgG antibody agarose, rabbit anti-human IgM antibody agarose, or monoclonal anti-CD4 antibody
agarose. Antibody agarose conjugates were prepared by coupling purified antibodies to cyanogen bromide
activated agarose according to the manufacturer's recommendations, using an antibody concentration
of 1 mg/ml. Following batch adsorption by shaking overnight on a rotary table, the beads were harvested by
pouring into a sintered glass funnel and washed a few times on the funnel with phosphate buffered saline
containing 1% Nonidet P40 detergent. The beads were removed from the funnel and poured into a small
disposable plastic column (Quik-Sep QS-Q column, Isolab), washed with at least 20 column volumes of
phosphate buffered saline containing 1% Nonidet P40, then with 5 volumes of 0.15 N NaCl, 1 mM EDTA (pH
8.0), and eluted by the addition of either 0.1 M acetic acid, 0.1 M acetic acid containing 0.1 M NaCl, or 0.25
M glycine-HCl buffer, pH 2.5.



Example 3: Blockage of Syncytium Formation by the Fusion Proteins

Purified or partially purified fusion proteins were added to HPB-ALL cells infected 12 hours previously
with a vaccinia virus recombinant encoding HIV envelope protein. After incubation for 6-8 more hours, the
cells were washed with phosphate buffered saline, fixed with formaldehyde, and photographed. All of the
full-length CD4 immunoglobulin fusion proteins showed inhibition of syncytium formation at a concentration
of 20 ug/ml with the exception of the CD4Hγ1 protein, which was tested only at 5 ug/ml and showed partial
inhibition of syncytium formation under the same conditions.



Example 4: Chromium Release Cytolysis Assay 



The purified fusion proteins were examined for ability to fix complement in a chromium release assay
using vaccinia virus infected cells as a model system. Namalwa (B cell) or HPB-ALL (T cell) lines were
infected with vaccinia virus encoding HIV envelope protein, and 18 hours later were radiolabeled by
incubation in 1 mCi/ml sodium 51-chromate in phosphate buffered saline for 1 hour at 37°C. The labeled cells
were centrifuged to remove the unincorporated chromate, and incubated in microtiter wells with serial
dilutions of the CD4 immunoglobulin fusion proteins and rabbit complement at a final concentration of 40%.
After 1 hour at 37°C, the cells were mixed well, centrifuged, and the supernatants counted in a gamma-ray
counter. No specific release could be convincingly documented.
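
The text does not state how release was scored. As a sketch only, the conventional way to express chromium release data is shown below; it assumes that spontaneous-release wells (cells without fusion protein) and maximum-release wells (fully lysed cells) were counted alongside the test wells, which the example does not say. All names are hypothetical.

```python
def percent_specific_release(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Conventional percent specific 51Cr release.

    experimental_cpm: counts in the supernatant of a test well
    spontaneous_cpm:  counts released without fusion protein (assumed control)
    maximum_cpm:      counts released on complete lysis (assumed control)
    """
    if maximum_cpm <= spontaneous_cpm:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental_cpm - spontaneous_cpm) / (maximum_cpm - spontaneous_cpm)


# Illustrative counts only, not data from this document
print(percent_specific_release(1500, 1000, 3000))
```

A value indistinguishable from zero across the dilution series corresponds to the outcome reported here, where no specific release could be documented.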
Example 5: Binding of the CD4Eγ1 Protein to Fc Receptors

Purified CD4Eγ1 fusion protein was tested for its ability to displace radiolabeled human IgG1 from
human Fc receptors expressed on COS cells in culture. The IgG1 was radiolabeled with sodium iodide
using 1 mCi of iodide, 100 ug of IgG1, and two Iodobeads (Pierce). The labeled protein was separated from
unincorporated counts by passage over a Sephadex G25 column equilibrated with phosphate buffered
saline containing 0.5 mM EDTA and 5% nonfat milk. Serial dilutions of the CD4Eγ1 fusion protein or
unlabeled IgG1 were prepared and mixed with a constant amount of radiolabeled IgG1 tracer. After
incubation with COS cells bearing the FcRI and FcRII receptors at 4°C for at least 45 minutes in a volume
of 20 ul, 200 ul of a 3:2 mixture of dibutyl to dioctyl phthalates were added, and the cells were separated from
the unbound label by centrifugation in a microcentrifuge for 15 to 30 seconds. The tubes were cut with
scissors, and the cell pellets counted in a gamma-ray counter. The affinity of the CD4Eγ1 protein for
receptors was measured in parallel with the affinity of the authentic IgG1 protein, and was found to be the
same, within experimental error.
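
One common way to compare two competitors in a displacement assay of this kind is to estimate, for each, the concentration that halves tracer binding. The sketch below does this by log-linear interpolation between dilution points. It is an assumption about how such data could be reduced, not the analysis used here; the function name and the numbers are hypothetical, and nonspecific binding is ignored for simplicity.

```python
import math


def ic50(concentrations, bound_cpm, uninhibited_cpm):
    """Interpolate the competitor concentration giving half-maximal tracer binding.

    concentrations:  competitor concentrations in ascending order (all > 0)
    bound_cpm:       cell-associated counts at each concentration
    uninhibited_cpm: counts bound with no competitor present
    """
    half = uninhibited_cpm / 2.0
    points = list(zip(concentrations, bound_cpm))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= half >= b_hi and b_lo != b_hi:
            # Interpolate on a log concentration axis between bracketing points
            frac = (b_lo - half) / (b_lo - b_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("half-maximal displacement is not bracketed by the dilution series")


# Illustrative dilution series only, not data from this document
print(ic50([1, 10, 100], [900, 500, 100], uninhibited_cpm=1000))
```

Running the same reduction on the fusion protein series and on the unlabeled IgG1 series and comparing the two estimates would express the "same affinity within experimental error" conclusion numerically.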


Example 6: Stable Expression of the Fusion Construct pCD4Eγ1 in Baby Hamster Kidney Cells

Twenty-four hours before transfection, 0.5 x 10^6 baby hamster kidney cells (BHK; ATCC CCL10) were
seeded in a 25 cm^2 culture flask in Dulbecco's modified Eagle's medium (DMEM) containing 10% of fetal
calf serum (FCS). The cells were cotransfected with a mixture of the plasmids pCD4Eγ1 (20 ug), pSV2dhfr
(5 ug; Lee et al., Nature 294:228-232 (1981)) and pRMH140 (5 ug; Hudziak et al., Cell 31:137-146 (1982))
according to a modified calcium phosphate transfection technique as described in Zettlmeissl et al. (Behring
Inst. Res. Comm. 82:26-34 (1988)). 72 h post-transfection, cells were split 1:3 to 1:4 (60 mm culture dishes)
and resistant colonies were selected in DMEM medium containing 10% FCS, 400 ug/ml G418 (Geneticin,
Gibco) and 1 uM methotrexate (selection medium). The medium was changed twice a week. The resistant
colonies (40-100/transfection) appeared 10-15 days post-transfection and were further propagated either as a
mixture of clones (i.e., BHK-NK1) or as individually isolated clones. For the determination of the relative
expression levels, clone mixtures or individual clones were grown to confluency in T25 culture flasks,
washed twice with protein-free DMEM medium, and incubated for 24 h with 5 ml protein-free DMEM
medium. These media were collected and subjected to a human IgG specific ELISA in order to determine
the relative expression levels of the CD4-IgG1 fusion protein CD4Eγ1. For further analysis an individual
clone (BHK-UC3) was chosen due to its high relative expression levels.

Example 7: Detection of the CD4Eγ1 Protein in Culture Supernatants

For 35S methionine labeling of cells, the clone BHK-UC3 and untransfected BHK cells (control) were
grown to confluency in T25 culture flasks and subsequently incubated for two hours in Ham F12 medium
without methionine. Labeling was achieved by incubating for 24 h in 2.5 ml of the same medium containing 100
uCi 35S methionine (1070 Ci/mmole, Amersham). For the preparation of cell lysates, the labeled cells were
harvested in 1 ml of phosphate buffered saline, pH 7.2 (PBS) and lysed by repetitive freezing and thawing.
Cleared lysates (after centrifugation at 20000 rpm, 20 min) and culture supernatants were incubated with
Protein A-Sepharose (Pharmacia), and the bound material was analyzed on a 10% SDS-gel according to
Laemmli (Nature 227:680-685 (1970)), which was subsequently autoradiographed. A specific band of about
80 kDa can be detected only in the supernatant of clone BHK-UC3; it is absent in the lysate of clone
BHK-UC3 and in the respective controls.



Example 8: Purification of the Protein CD4Eγ1 from Culture Supernatants

In order to demonstrate that the fusion protein coded by the plasmid pCD4Eγ1 can be obtained in high
quantities, the clone BHK-UC3 was grown in 1750 cm^2 roller bottles in selection medium (500 ml). Confluent
monolayers were washed twice with protein-free DMEM medium (200 ml) and further incubated for 48 h
with protein-free DMEM medium (500 ml). The conditioned culture supernatants (1-2 l) and respective
supernatants from untransfected BHK cells were cleared by centrifugation (9000 rpm, 30 min) and
microfiltered through a 0.45 um membrane (Nalgene). After addition of 1% (v/v) of 1.9 M Tris-HCl buffer,
pH 8.6, the conditioned medium was adsorbed to a Protein A-Sepharose column equilibrated with 50 mM
Tris-HCl pH 8.6 buffer containing 150 mM NaCl (4°C). The loaded column was washed with 10 column
volumes of equilibration buffer. Elution of the CD4-IgG1 fusion protein CD4Eγ1 was achieved with 0.1 M
sodium citrate buffer, pH 3, followed by immediate neutralization of the column efflux to pH 8 with Tris base.
The peak fractions were pooled, and the pool was analyzed on a Coomassie blue stained SDS-gel, resulting
in a band of the expected size (80 kDa) which reacted with a polyclonal anti-human IgG heavy chain
antibody and a mouse monoclonal anti-CD4 antibody (BMA040, Behringwerke) in Western blots. The
yield of purified fusion protein obtained by the given procedure is 5-18 mg/24 h/l culture supernatant. The
respective value for a BHK clone mixture (about 80 resistant clones; BHK-NK1) as described above was 2-3
mg/24 h/l.
Example 9: Physical and Biological Characterization of the CD4Eγ1 Fusion Protein

As proven by SDS-electrophoresis on 10-15% gradient gels (Phast-System, Pharmacia) under non-reductive
conditions, the CD4Eγ1 fusion protein migrates at the position of a homodimer (about 160 kDa)
like a non-reduced mouse monoclonal antibody. This result is supported by analytical equilibrium
ultracentrifugation, where the fusion protein behaves as a homogeneous dimeric molecule of about 150 kDa. The
absorbance coefficient of the protein was determined as A280 = 18 cm^2/mg using the quantitative protein
determination according to Bradford (Anal. Biochem. 72:248-254 (1976)).
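
An absorbance coefficient of this kind is normally used to convert an A280 reading into a protein concentration through the Beer-Lambert relation. The sketch below shows that conversion under the assumption of a coefficient expressed in cm^2/mg (equivalently ml per mg per cm) and a cuvette path length in cm. The function name is hypothetical and the numbers in the usage line are placeholders, not the coefficient determined above.

```python
def concentration_mg_per_ml(a280, coefficient_cm2_per_mg, path_length_cm=1.0):
    """Protein concentration from absorbance via Beer-Lambert: c = A / (e * l)."""
    if coefficient_cm2_per_mg <= 0 or path_length_cm <= 0:
        raise ValueError("coefficient and path length must be positive")
    return a280 / (coefficient_cm2_per_mg * path_length_cm)


# Placeholder values for illustration only
print(concentration_mg_per_ml(0.75, 1.5))
```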

The CD4Eγ1 fusion protein shows specific complex formation with a solubilized β-gal-gp120 fusion
protein (pMB1790; Broker et al., Behring Inst. Res. Commun. 82:338-348 (1988)) expressed in E. coli. In this
protein (110 kDa), a major part of the HIV gp120 protein (VaUs-Trp***) is fused to β-galactosidase (amino
acids 1-375). In a control experiment a 67-kDa β-gal-HIV 3'orf fusion protein (β-gal 1-375; 3'orf Pro14-Asp123)
showed no complex formation. In these experiments, the CD4Eγ1 protein was incubated with the
respective fusion protein in molar ratios of about 5:1. The complex was isolated by binding to Protein
A-Sepharose, and the Protein A-Sepharose bound proteins, together with relevant controls, were analyzed on
10-15% gradient SDS-gels (Phast-System, Pharmacia).

The CD4Eγ1 fusion protein binds to the surface of HIV (HIV1/HTLV-IIIB) infected cultured
T4-lymphocytes as determined by direct immunofluorescence with fluorescein-isothiocyanate (FITC) labeled
CD4Eγ1 protein. It blocks syncytia formation in cultured T4-lymphocytes upon HIV infection (0.25 TCID/cell)
at a concentration of 10 ug/ml. Furthermore, HIV-infected cultured T4-lymphocytes (subclone of cell line
H9) are selectively killed upon incubation with CD4Eγ1 in the presence or absence of complement. To a
highly (>50%) HIV infected culture of T4-lymphocytes (10* cells/ml), 50, 10 or 1 ug/ml CD4Eγ1 fusion
protein was added in the presence or absence of guinea pig complement. Cells were observed for specific
killing by the fusion protein, which is defined as the percentage of killed cells after 3 days in relation to
viable cells in the culture at the beginning of the experiment, corrected by the values for unspecific killing
observed in control cultures lacking the CD4Eγ1 fusion protein (Table 5, Experiment I). Surprisingly,
addition of CD4Eγ1 protein to the infected T4 cells in the absence of complement resulted in similar
specific killing rates as in the presence of complement (Table 5, Experiment II). This result demonstrates a
complement independent cytolytic effect of CD4Eγ1 on HIV infected T-lymphocytes in culture.
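
The definition of specific killing given above can be written out as a short calculation. This is one reading of that definition, assuming the correction for unspecific killing is a simple subtraction of the control percentage; the text does not spell out the arithmetic. The function name and the cell counts are hypothetical.

```python
def specific_killing(viable_start_treated, killed_treated,
                     viable_start_control, killed_control):
    """Percent specific killing after correction for unspecific killing.

    Each percentage is killed cells after 3 days relative to viable cells
    at the start; the control culture lacks the fusion protein.
    Assumption: the correction is a plain subtraction.
    """
    if viable_start_treated <= 0 or viable_start_control <= 0:
        raise ValueError("starting viable cell counts must be positive")
    pct_treated = 100.0 * killed_treated / viable_start_treated
    pct_control = 100.0 * killed_control / viable_start_control
    return pct_treated - pct_control


# Illustrative counts only, not data from Table 5
print(specific_killing(1000, 450, 1000, 100))
```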

Table 5

Experiment   Assay System                                          Specific
No.                                                                Killing (%)

I            non-infected T4-cells + 50 ug/ml CD4Eγ1 + Compl.       0.7
             infected T4-cells + 50 ug/ml CD4Eγ1 + Compl.          35.1
             infected T4-cells + 10 ug/ml CD4Eγ1 + Compl.          25.1
             infected T4-cells + 1 ug/ml CD4Eγ1 + Compl.           25

II           infected T4-cells + 10 ug/ml CD4Eγ1 + Compl.          49.9
             infected T4-cells + 10 ug/ml CD4Eγ1 + Compl.          69.4

Having now fully described this invention, it will be appreciated by those skilled in the art that the same
can be performed within a wide range of equivalent parameters of composition, conditions, and methods of
preparing such fusion proteins without departing from the spirit or scope of the invention or any
embodiment thereof.
Claims

1. A fusion protein gene comprising 1) the DNA sequence of CD4, or fragment thereof which binds to
HIV gp120, and 2) the DNA sequence of an immunoglobulin heavy chain, characterized in that the DNA
sequence which encodes the variable region of said immunoglobulin chain has been replaced with the DNA
sequence which encodes CD4, or said gp120 binding fragment thereof.

2. The fusion protein gene of claim 1, wherein the DNA sequence which encodes said fragment of CD4
comprises the following DNA sequence:



CAA7CAACCGGG

120 

GTTACTTGGCCC 



CAC7CCCTT77AGGCACT7CC77C7CCTCC7GCAACTCCCGCTCCTCCCAGCACCCAC7C 

C7CACCGAAAA7CCG7CAACGAACACCACGACG77CACCCCCACGACGG7CC7CCG7GAG 

ACGGAAAGAAAC7CGTCCTGCGCAAAAAACCGCATACACTCGAACTGACCTCTACACC77 

7CCC777C777CACCACGACCCG777777CCCC7A7G7CACC77CAC7CGACA7G7CCAA 

CCCAGAAGAAGAGCA7ACAA77CCAC7GCAAAAAC7CCAACCACA7AAACA77C7CGGAA 
+ + + + ^ 

CGC7C77C77C7CG7A7G77AAGG7GACC77777GAGG77CG7C7A777C7AAGACCC77 
A7CACGGC7CC77C77AAC7AAAGC7CCA7CCAAGC7CAA7GA7CGCGC7GAC7CAAGAA 
7AG7CCCGAGGAAGAA77GA777CCAGG7AGG77CGAC77AC7AGCGCGAC7CAG77C77 
GAACCC777GGGACCAAGGAAAC77CCCCC7CA7CA7CAAGAA7C77AAGA7ACAAGAC7 
C77CCGAAACCC7CC77CC777CAAGCGGGAC7AC7AG77Ct4aCAA77C7A7C77C7CA 



CACA7AC77ACA7C7G7GAAC7GCAGGACCACAAGCACGAGC7GCAA77GC7AC7C77CG 

421 ♦ + ♦ ♦ ♦ 480 

4S C7C7A7GAA7C7AGACAC77CACC7CC7GG7C77CC7CC7CCACG77AACGA7CACAAGC 

CA77CAC7GCCAAC7C7GACACCCACC7CC77C 

481 ♦ * 

50 C7AAC7GACCG77GAGAC7G7GGG7GCACGAAG 

or a degenerate variant thereof, or the following DNA sequence: 

CAA7CAACCGGC

120 

CTTACTTGGCCC 



CAC7CCC7777AGGCAC77CC77C7GG7CC7GCAAC7GGCGC7CC7CCCAGCAGCCAC7C 

«, «. ♦ ♦ ♦ 1 

C7CACCGAAAA7CCC7GAACGAAGACCACGACG77GACCCCGAGGAGGG7CC7CCG7CAC 

AGGGAAAGAAAGTGGTGCTGGGCAAAAAAGGGGATACAGTGGAACTGACCTGTACAGCTT 

7CCC777C777CACCACGACCCG777777CCCC7A7C7CACC77GAC7GGACA7C7CCAA 

CCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGGCAA 

GCC7C77C77C7CC7A7C77AAGG7CACC77777GAGG77GG7C7A777C7AAGACCC77 

ATCAGGGCTCCTTCTTAACTAAACGTCCATCCAAGCTGAATCATCGCGCTGACTCAAGAA 

7AG7CCCCAGGAAGAA77GA777CCAGG7AGG77CCAC77AC7AGCGCCAC7GAG77C77 

GAAGCCTTTGGGACCAAGGAAAC7TCCCCCTGATCATCAAGAATCTTAAGATAGAAGACT 

CTTCCGAAACCCTGGTTCCTTTGAAGGGGGACTAGTAGTTCTTAGAATTCTATCTTCTGA 

CAGA7AC77ACA7C7G7CAAC7GCAGGACCAGAACGACGAGG7GCAA77GC7AC7G7TCG 

1 * * * * * 

C7C7A7CAA7G7AGACAC77CACC7CC7GC7C77CC7CC7CCACC77AACGA7CACAAGC 



GA77CAC7CCCAAC7C7CACACCCACC7CC77CACCCGCAGAGCC7CACCC7GACC77GG 

481 * * * 540 

C7AAC7CACGG77GACAC7C7CCG7GGACGAAC7CCCCG7C7CGGAC7CCGAC7GGAACC 



AGACCCCCCC7GG7AC7AGCCCC7CAG7GCAA7C7AGGAG7CCAACGGG7AAAAACA7AC 

541 + ♦ * * * 600 

7C7CCGGGGGACCA7CA7CGCGGAGTCACG77ACA7CC7CAGG77CCCCA77777G7A7G 

AGGGCGGGAAGACCC7C7CCG7G7CTCAG 

601 ♦ ♦ 

7CCCCCCC77C7GCGAGAGGCACAGAG7C 

or a degenerate variant thereof. 

3. The fusion protein gene of claim 1 or 2, characterized in that said immunoglobulin chain is of the
class IgM, IgG1 or IgG3.

4. A fusion protein gene comprising 1) the DNA sequence of CD4, or fragment thereof which binds to
HIV gp120, and 2) the DNA sequence of an immunoglobulin light chain, characterized in that the DNA
sequence which encodes the variable region of said immunoglobulin light chain has been replaced with the
DNA sequence which encodes CD4, or HIV gp120-binding fragment thereof.

5. A fusion protein gene of claim 4, characterized in that the DNA sequence which encodes said
fragment of CD4 comprises the following DNA sequence:

CAATGAACCGGG 

120 

G 77 AC 77 GGCCC 

GAC7CCC7777AGGCAC77GC77C7GG7GCTCCAAC7CGCGCTCCTCCCAGCAGCCAC7C 



121 * "0 

^ C7CAGGGAAAA7CCC7GAACGAAGACCACGACC77CACCCCGACGAGGG7CG7GGG7GAC 

20 

AGCGAAAGAAAC7GG7GC7GGGCAAAAAAGGGGA7ACAG7GGAAC7GACC7G7ACACC77 

181 * ♦ * * * HO 

7CCC777C777CACCACGACCCG77T777CGGC7A7C7CACC77GAC7GGACA7G7CGAA 

25 

CCC AGA AG AAG AGC A7AC A AT7CCAC7CGA A A AACTCC A ACC AG ATA A AGA7TC7GGG A A 

241 ♦ ♦ + * 300 

GGG7C77C7TCTCGTA7G7TAAGGTGACCT77TTGAGG7TGGTC7A7T7C7AAGACCC7T 

30 

ATCAGCGC7CC7TC77AAC7AAAGCTCCATCCAAGCTGAATGATCCCGC7GACTCAAGAA 

301 * ♦ ♦ ♦ ♦ 360 

TAC7CCCCAGGAACAATTCA777CCAGGTAGC77CCAC77AC7ACCGCGAC7GAG77C77 

35 GAAGCC77TCGGACCAAGGAAAC7TCCCCCTCA7CATCAAGAATCT7AAGATAGAACACT 

361 ♦ * * * 420 

C77CGGAAACCC7CG77CCT77CAACGGCGAC7AC7ACnC|rAGAATTCTA7C7TCTCA 

40 CAGATACT7ACATCTGTGAAG7GGAGGACCAGAACGAGCAGG7GCAA77GC7AGTG7TCG 

421 ♦ * ♦ - - + ♦ 480 

GTC7ATGAA7GTACACAC77CACC7CC7GGTC77CC7CCTCCACG77AACGA7CACAAGC 



CA7TGACTGCCAACTCTGACACCCACCTGCTTC 

481 ♦ + 

C7AACTGACCG7TGAGACTC7GGGTCGACGAAG 

or a degenerate variant thereof, or the following DNA sequence: 
CAA7CAACCGGG

120 

CTTACTTCGCCC 



GAG7CCC7777ACGCAC77CC77C7GG7CC7GCAAC7GGCGC7CC7CCCAGCACCCAC7C 
C7CACCGAAAA7CCG7CAACCAAGACCACGACC77GACCGCCAGCAGGC7CC7CCG7CAC 
AGGCAAAGAAAG7GG7GC7GGCCAAAAAAGGGGATACAGTGGAACTGACCTG7ACAGC77 
TCCC777C777CACCACGACCCG7T7777CCCC7A7G7CACC77GAC7GGACA7G7CCAA 



CCCAGAAGAACAGCA7ACAA77CCAC7GCAAAAAC7CCAACCACA7AAACA77C7GGCAA 

241 3 

GGC7CTTCT7CTCCTATGTTAAGGTGACCTTTTTGAGGTTGGTCTATTTCTAAGACCCTT 

A7CACGGC7CC77C77AAC7AAAGG7CCA7CCAAGC7CAA7CA7CCCGC7GAC7CAAGAA 

7AG7CCCGACGAAGAA77GA777CCAGG7AGG77CCAC77AC7AGCCCGAC7GAG77C77 

GAACCC777GGGACCAACGAAAC77CCCCC7CA7CA7CAAGAA7C77AACA7ACAACAC7 

C77CCGAAACCC7GG77CCTT7CAACCGGGAC7AG7AGT7CTTAGAA77C7A7C77C7GA 

CAGA7AC77ACA7C7G7GAAG7GGAGCACCAGAAGGAGGAGG7GCAA77GC7AG7G77CG 

G7C7A7CAA7C7AGACAC77CACC7CC7GG7C77CC7CC7CCACC77AACGA7CACAACC 

CAT7CAC7CCCAAC7C7GACACCCACC7GCT7CAGGGGCACACCC7IACCC7CACC77GG 

C7AAC7GACGG77GAGAC7C7GGG7GCACGAAG7CCCCG7C7CGGAC7GGGAC7CGAACC 

AGACCCCCCC7GG7AG7AGCCCC7CAG7GCAA7G7AGGAG7CCAAGCGG7AAAAACA7AC 

541 ♦ * * * * 

7C7CCGGGGGACCA7CA7CGGCGAG7CACG77ACA7CC7CAGG77CCCCA77777G7A7G 

AGGCCGGGAAGACCC7C7CCG7G7C7CAG 

601 + * 

7CCCCCCC77C7GGGAGACGCACAGAG7C 

or a degenerate variant thereof. 
6. A vector comprising the fusion protein gene of claim 1, preferably having the identifying characteristics
of pCD4Hγ1, which has been deposited under Accession No. 67611, or of pCD4Mu, which has been
deposited under Accession No. 67608, or of pCD4Pu, which has been deposited under Accession No.
67609, or of pCD4Eγ1, which has been deposited under Accession No. 67610, all in E. coli at the ATCC
under the terms of the Budapest Treaty.

7. A vector comprising the fusion protein gene of claim 4.

8. A host transformed with the vector of claim 6 or 7.

9. The host of claim 8 which expresses an immunoglobulin light chain together with the expression
product of said fusion protein gene to give an immunoglobulin-like molecule which binds to gp120, or an
immunoglobulin heavy chain together with the expression product of said fusion protein gene to give an
immunoglobulin-like molecule which binds to HIV or SIV gp120.

10. The host of claim 9, wherein said immunoglobulin heavy chain is of the immunoglobulin class IgM,
IgG1 or IgG3.

11. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120,
and an immunoglobulin heavy chain, wherein the variable region of the immunoglobulin chain has been
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a
nutrient medium under protein-producing conditions, a host strain transformed with the vector of claim 6,
said vector further comprising expression signals which are recognized by said host strain and direct
expression of said fusion protein, and recovering the fusion protein so produced.

12. The method of claim 11, wherein said host strain is a myeloma cell line which produces
immunoglobulin light chains and said fusion protein comprises an immunoglobulin heavy chain of the class
IgM, IgG1 or IgG3, wherein an immunoglobulin-like molecule comprising said fusion protein is produced.

13. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120,
and an immunoglobulin light chain, wherein the variable region of the immunoglobulin chain has been
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a
nutrient medium under protein-producing conditions, a host strain transformed with the vector of claim 7,
said vector further comprising expression signals which are recognized by said host strain and direct
expression of said fusion protein, and recovering the fusion protein so produced.

14. The method of claim 13, wherein said host produces immunoglobulin heavy chains of the class
IgM, IgG1 and IgG3 together with said fusion protein to give an immunoglobulin-like molecule which binds
to HIV gp120.

15. A fusion protein, which is preferably detectably labeled, comprising CD4, or fragment thereof which
is capable of binding to HIV or SIV gp120, fused at the C-terminus to a second protein which comprises an
immunoglobulin heavy chain of the class IgM, IgG1 or IgG3, wherein the variable region of said heavy chain
immunoglobulin has been replaced with CD4, or HIV gp120-binding fragment thereof, and preferably further
comprising a therapeutic agent, radiolabel or NMR imaging agent linked to said fusion protein.

16. The fusion proteins CD4Hγ1, CD4Mu, CD4Pu, CD4Eγ1 or CD4Bγ1.

17. An immunoglobulin-like molecule, comprising the fusion protein of claim 15 and an immunoglobulin
light chain, preferably further comprising a detectable label, and especially further comprising a therapeutic
agent, radiolabel or NMR imaging agent linked to said immunoglobulin-like molecule.

18. A fusion protein comprising CD4, or fragment thereof which binds to HIV gp120, fused at the
C-terminus to a second protein comprising an immunoglobulin light chain where the variable region has been
deleted, and which fusion protein preferably is detectably labeled, especially further comprising a therapeutic
agent, radiolabel or NMR imaging agent linked to said fusion protein.

19. The fusion protein of claim 15, wherein said CD4 fragment comprises the following amino acid
sequence:
MNRG
VPFRHLLLVLQLALLPAATQ
GKKVVLGKKGDTVELTCTAS
QKKSIQFHWKNSNQIKILGN
QGSFLTKGPSKLNDRADSRR
SLWDQGNFPLIIKNLKIEDS
DTYICEVEDQKEEVQLLVFG
LTANSDTHLLQ

or the following amino acid sequence:

MNRG
VPFRHLLLVLQLALLPAATQ
GKKVVLGKKGDTVELTCTAS
QKKSIQFHWKNSNQIKILGN
QGSFLTKGPSKLNDRADSRR
SLWDQGNFPLIIKNLKIEDS
DTYICEVEDQKEEVQLLVFG
LTANSDTHLLQGQSLTLTLE
SPPGSSPSVQCRSPRGKNIQ
GGKTLSVSQ

20. An immunoglobulin-like molecule comprising the fusion protein of claim 18 and an immunoglobulin
heavy chain of the class IgM, IgG1 or IgG3, preferably further comprising a detectable label, and especially
further comprising a therapeutic agent, radiolabel or NMR imaging agent linked to said immunoglobulin-like
molecule.

21. A complex comprising the fusion protein of claim 15 or 18 and HIV or SIV gp120.

22. The complex of claim 21, wherein said gp120 is a part of an HIV or SIV, is expressed on the
surface of an HIV or SIV-infected cell or is present in solution.

23. A method for the detection of HIV or SIV gp120 in a sample, characterized by

(a) contacting a sample suspected of containing HIV or SIV gp120 with the fusion protein of claim 15
or 18, and

(b) detecting whether a complex is formed, said fusion protein preferably being detectably labeled.
Claims for the following Contracting State: GR

1. A vector comprising a fusion protein gene comprising 1) the DNA sequence of CD4, or fragment
thereof which binds to HIV gp120, and 2) the DNA sequence of an immunoglobulin heavy chain,
characterized in that the DNA sequence which encodes the variable region of said immunoglobulin chain
has been replaced with the DNA sequence which encodes CD4, or said gp120 binding fragment thereof.

2. The vector of claim 1, having the identifying characteristics of pCD4Hγ1, which has been deposited
in E. coli at the ATCC under the terms of the Budapest Treaty under Accession No. 67611.

3. The vector of claim 1, having the identifying characteristics of pCD4Mu, which has been deposited in
E. coli at the ATCC under the terms of the Budapest Treaty under Accession No. 67608.

4. The vector of claim 1, having the identifying characteristics of pCD4Pu, which has been deposited in
E. coli at the ATCC under the Budapest Treaty under Accession No. 67609.

5. The vector of claim 1, having the identifying characteristics of pCD4Eγ1, which has been deposited in
E. coli at the ATCC under the terms of the Budapest Treaty under Accession No. 67610.

6. A vector comprising a fusion protein gene characterized by 1) the DNA sequence of CD4, or
fragment thereof which binds to HIV gp120, and 2) the DNA sequence of an immunoglobulin light chain,
wherein the DNA sequence which encodes the variable region of said immunoglobulin light chain has been
replaced with the DNA sequence which encodes CD4, or HIV gp120-binding fragment thereof.

7. A host transformed with the vector of claim 1.

8. The host of claim 7 which expresses an immunoglobulin light chain together with the expression
product of said fusion protein gene to give an immunoglobulin-like molecule which binds to gp120.

9. A host transformed with the vector of claim 6.

10. The host of claim 6 which expresses an immunoglobulin heavy chain together with the expression
product of said fusion protein gene to give an immunoglobulin-like molecule which binds to HIV or SIV
gp120.

11. The host of claim 10, characterized in that said immunoglobulin heavy chain is of the
immunoglobulin class IgM, IgG1 or IgG3.
12. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120,
and an immunoglobulin heavy chain, wherein the variable region of the immunoglobulin chain has been
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a
nutrient medium under protein-producing conditions, a host strain transformed with the vector of claim 1,
said vector further comprising expression signals which are recognized by said host strain and direct
expression of said fusion protein, and recovering the fusion protein so produced.

13. The method of claim 12, characterized in that said host strain is a myeloma cell line which produces
immunoglobulin light chains and said fusion protein comprises an immunoglobulin heavy chain of the class
IgM, IgG1 or IgG3, wherein an immunoglobulin-like molecule comprising said fusion protein is produced.

14. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120,
and an immunoglobulin light chain, wherein the variable region of the immunoglobulin chain has been
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a
nutrient medium under protein-producing conditions, a host strain transformed with the vector of claim 6,
said vector further comprising expression signals which are recognized by said host strain and direct
expression of said fusion protein, and recovering the fusion protein so produced.

15. The method of claim 14, characterized in that said host produces immunoglobulin heavy chains of
the class IgM, IgG1 and IgG3 together with said fusion protein to give an immunoglobulin-like molecule
which binds to HIV gp120.

16. A method for the detection of HIV or SIV gp120 in a sample, characterized by

(a) contacting a sample suspected of containing HIV or SIV gp120 with a fusion protein comprising
1) CD4, or fragment thereof which binds to HIV gp120, and 2) an immunoglobulin heavy chain, wherein the
variable region of said immunoglobulin chain has been replaced with CD4, or said gp120 binding fragment
thereof, and

(b) detecting whether a complex is formed.

17. The method of claim 16, characterized in that said fusion protein is detectably labeled.

18. A method for the detection of HIV or SIV gp120 in a sample, characterized by

(a) contacting a sample suspected of containing HIV or SIV gp120 with a fusion protein
comprising 1) CD4, or fragment thereof which binds to HIV gp120, and 2) an immunoglobulin light chain,
wherein the variable region of said immunoglobulin light chain has been replaced with CD4, or HIV
gp120-binding fragment thereof, and

(b) detecting whether a complex has formed.

19. The method of claim 18, characterized in that said fusion protein is detectably labeled.


Claims for the following Contracting State: ES 

1. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120,
and an immunoglobulin heavy chain, wherein the variable region of the immunoglobulin chain has been
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a
nutrient medium under protein-producing conditions, a host strain transformed with a vector comprising a
fusion protein gene comprising 1) the DNA sequence of CD4, or fragment thereof which binds to HIV
gp120, and 2) the DNA sequence of an immunoglobulin heavy chain, wherein the DNA sequence which
encodes the variable region of said immunoglobulin chain has been replaced with the DNA sequence which
encodes CD4, or said gp120 binding fragment thereof, said vector further comprising expression signals
which are recognized by said host strain and direct expression of said fusion protein, and recovering the
fusion protein so produced.

2 The method of claim 1, characterized in that said vector has the identifying characteristics of 
pCD4H 7 l. which has been deposited in E. coli at the ATCC under the terms of the Budapest Treaty under 

Accession No. 6761 1 . . ~i 

3 The method of claim 1. characterized in that said vector has the identifying characteristics of 
PCD4MU. which has been deposited in E. coli at the ATCC under the terms of this Budapest Treaty under 

Accession No. 67608. ., . , , . . , 

4. The method of claim 1, characterized in that said vector has the identifying characteristics of
pCD4Pμ, which has been deposited in E. coli at the ATCC under the Budapest Treaty under Accession No.
67609.

5. The method of claim 1, characterized in that said vector has the identifying characteristics of
pCD4Eγ1, which has been deposited in E. coli at the ATCC under the terms of the Budapest Treaty under
Accession No. 67610.

6. The method of claim 1, characterized in that said host strain is a myeloma cell line which produces
immunoglobulin light chains and said fusion protein comprises an immunoglobulin heavy chain of the class
IgM, IgG1 or IgG3, wherein an immunoglobulin-like molecule comprising said fusion protein is produced.

7. A method of producing a fusion protein comprising CD4, or fragment thereof which binds to gp120,
and an immunoglobulin light chain, wherein the variable region of the immunoglobulin chain has been
substituted with CD4, or fragment thereof which binds to HIV or SIV gp120, characterized by cultivating in a
nutrient medium under protein-producing conditions, a host strain transformed with a vector comprising a
fusion protein gene comprising 1) the DNA sequence of CD4, or fragment thereof which binds to HIV
gp120 and 2) the DNA sequence of an immunoglobulin light chain, wherein the DNA sequence which
encodes the variable region of said immunoglobulin light chain has been replaced with the DNA sequence
which encodes CD4, or HIV gp120-binding fragment thereof, said vector further comprising expression
signals which are recognized by said host strain and direct expression of said fusion protein, and recovering
the fusion protein so produced.

8. The method of any one of claims 1 or 7, characterized in that the DNA sequence which encodes said
fragment of CD4 comprises the following DNA sequence:

    CAATGAACCGGG
    -+---------+ 120
    GTTACTTCGCCC

    CAGTCCCTTTTACGCACTTGCTTCTCGTGCTGCAACTGGCGCTCCTCCCAGCAGCCACTC
121 ---------+---------+---------+---------+---------+---------+ 180
    CTCAGGGAAAATCCGTCAACGAAGACCACGACGTTGACCGCGAGGAGGGTCGTCGGTGAG

    ACCCAAAGAAACTCGTCCTGGGCAAAAAAGGGGATACACTGGAACTGACCTGTACAGCTT
181 ---------+---------+---------+---------+---------+---------+ 240
    TCCCTTTCTTTCACCACGACCCGTTTTTTCCCCTATCTCACCTTGACTCGACATGTCCAA

    CCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGGGAA
241 ---------+---------+---------+---------+---------+---------+ 300
    GGGTCTTCTTCTCCTATGTTAAGCTGACCTTTTTGACGTTGGTCTATTTCTAAGACCCTT

    ATCACGGCTCCTTCTTAACTAAACGTCCATCCAAGCTGAATGATCGCGCTGACTCAAGAA
301 ---------+---------+---------+---------+---------+---------+ 360
    TAGTCCCGAGGAACAATTGATTTCCAGGTAGGTTCGACTTACTAGCGCGACTGAGTTCTT

    GAAGCCTTTCCGACCAAGCAAACTTCCCCCTGATCATCAAGAATCTTAAGATACAACACT
361 ---------+---------+---------+---------+---------+---------+ 420
    CTTCCGAAACCCTGGTTCCTTTGAAGGGGGACTAGTAGTTCTTAGAATTCTATCTTCTCA

    CAGATACTTACATCTGTCAAGTGGAGGACCAGAAGGAGGAGGTGCAATTGCTAGTGTTCG
421 ---------+---------+---------+---------+---------+---------+ 480
    GTCTATGAATGTACACACTTCACCTCCTGGTCTTCCTCCTCCACGTTAACGATCACAAGC

    GATTGACTGCCAACTCTGACACCCACCTGCTTC
481 ---------+---------+---------+---
    CTAACTGACGGTTGAGACTGTGGGTGGACGAAG


or a degenerate variant thereof. 

9. The method of any one of claims 1 or 7, characterized in that said DNA sequence which encodes
said fragment of CD4 comprises the following DNA sequence:

    CAATGAACCGGG
    -+---------+ 120
    GTTACTTGGCCC

    GAGTCCCTTTTAGGCACTTGCTTCTGGTGCTCCAACTGGCCCTCCTCCCAGCAGCCACTC
121 ---------+---------+---------+---------+---------+---------+ 180
    CTCAGGGAAAATCCGTGAACGAAGACCACGACGTTGACCGCCAGGAGGGTCGTCGGTGAG

    AGGGAAAGAAAGTGGTGCTGGGCAAAAAAGGGGATACACTGGAACTCACCTGTACACCTT
181 ---------+---------+---------+---------+---------+---------+ 240
    TCCCTTTCTTTCACCACGACCCGTTTTTTCCCCTATGTCACCTTGACTGGACATGTCGAA

    CCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTCGGAA
241 ---------+---------+---------+---------+---------+---------+ 300
    GGGTCTTCTTCTCGTATGTTAAGGTGACCTTTTTGAGGTTGGTCTATTTCTAAGACCCTT

    ATCAGGGCTCCTTCTTAACTAAAGGTCCATCCAAGCTGAATGATCGCGCTGACTCAAGAA
301 ---------+---------+---------+---------+---------+---------+ 360
    TAGTCCCGAGGAAGAATTGATTTCCAGGTAGGTTCGACTTACTAGCGCGACTGAGTTCTT

    GAAGCCTTTGCGACCAAGGAAACTTCCCCCTGATCATCAAGAATCTTAACATAGAACACT
361 ---------+---------+---------+---------+---------+---------+ 420
    CTTCGGAAACCCTGGTTCCTTTGAAGGGGGACTAGTAGTTCTTAGAATTCTATCTTCTGA

    CAGATACTTACATCTGTGAAGTGGAGGACCAGAAGGAGGACGTGCAATTGCTAGTGTTCG
421 ---------+---------+---------+---------+---------+---------+ 480
    GTCTATGAATGTAGACACTTCACCTCCTGGTCTTCCTCCTCCACGTTAACGATCACAAGC

    CATTGACTGCCAACTCTGACACCCACCTGCTTCAGGGGCAGAGCCTGACCCTCACCTTGG
481 ---------+---------+---------+---------+---------+---------+ 540
    CTAACTGACGGTTGAGACTGTGGGTGGACGAAGTCCCCGTCTCGGACTGGGACTGGAACC

    AGAGCCCCCCTGGTAGTAGCCCCTCAGTGCAATGTAGGAGTCCAAGGGGTAAAAACATAC
541 ---------+---------+---------+---------+---------+---------+ 600
    TCTCGGGGGGACCATCATCGGCCAGTCACGTTACATCCTCAGGTTCCCCATTTTTGTATG

    ACGCGGCCAACACCCTCTCCGTCTCTCAC
601 ---------+---------+---------
    TCCCCCCCTTCTGGGAGAGGCACAGAGTC


or a degenerate variant thereof. 

10. The method of claim 7, characterized in that said host produces immunoglobulin heavy chains of
the class IgM, IgG1 and IgG3 together with said fusion protein to give an immunoglobulin-like molecule
which binds to HIV-gp120.